Paul King
Dr Paul King has been contributing to open source projects for nearly 30 years and is an active committer on numerous projects including Groovy, GPars and Gradle. Paul speaks at international conferences, publishes in software magazines and journals, and is a co-author of Manning’s best-seller: Groovy in Action, 2nd Edition.
Groovy and Data Science
Groovy is a powerful multi-paradigm programming language for the JVM that offers a wealth of features that make it ideal for many data science and big data scenarios.
Groovy has a dynamic nature like Python, which means that it is very powerful, easy to learn, and productive. The language gets out of the way and lets data scientists write their algorithms naturally.
It has a static nature like Java and Kotlin, which makes it fast when needed. Its close alignment with Java means that you can often just cut-and-paste the Java examples from various big data solutions and they’ll work just fine in Groovy.
And it has first-class functional support, meaning that it offers features and allows solutions similar to Scala. Functional and stream processing with immutable data structures can offer many advantages when working in parallel processing or clustered environments.
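As a rough illustration of the functional, stream-based style described above, here is a minimal sketch using the standard Java streams API over an immutable list. It is written in Java, which recent Groovy versions should accept largely unchanged; the class name `StreamSketch` and the method `sumOfEvenSquares` are hypothetical names chosen for this example, not taken from the talk.

```java
import java.util.List;

class StreamSketch {
    // Hypothetical helper: sums the squares of the even values.
    // The input is never mutated, so the work is safe to split across threads.
    static int sumOfEvenSquares(List<Integer> values) {
        return values.parallelStream()
                .filter(v -> v % 2 == 0)   // keep even values only
                .mapToInt(v -> v * v)      // square each one
                .sum();                    // combine partial results
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 2, 3, 4, 5, 6); // immutable list
        System.out.println(sumOfEvenSquares(values));     // prints 56
    }
}
```

Because each step is a pure function over immutable data, switching between `stream()` and `parallelStream()` does not change the result, which is the property that makes this style attractive in parallel or clustered settings.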
These slides review the key benefits of using Groovy to develop data science solutions, including its integration with JVM libraries commonly used for data manipulation, machine learning, and plotting, and with big data frameworks for scaling up these algorithms.
Math/Data Science libraries covered include:
Weka, Smile, Apache Commons Math, BeakerX notebooks, Deeplearning4j.
Libraries for scaling/concurrency include:
Apache Spark, Apache Ignite, Apache MXNet, GPars, Apache Beam.
An introduction to Property-based testing
Property-based testing is an approach to testing that involves checking that a system meets certain expected properties. The approach is frequently promoted as a desirable technique when adopting a functional style of programming. It typically involves using a generator framework to produce large sets of test data, which can be much less work than coding large test suites by hand. This talk looks at the concepts behind this approach and some of the available libraries. The examples are mostly in Groovy but should be easily ported to other JVM languages. The concepts are applicable across all languages.
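To make the idea concrete, here is a minimal hand-rolled sketch that uses only the Java standard library rather than any of the libraries the talk covers. It generates random lists and checks one property: reversing a list twice yields the original list. The names `ReverseProperty`, `randomList`, `holdsFor`, and `check` are hypothetical and exist only for this illustration; real property-based testing frameworks add features such as shrinking failing inputs, which this sketch omits.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class ReverseProperty {
    // Generator: a list of 0..19 random integers in the range 0..999.
    static List<Integer> randomList(Random rng) {
        int size = rng.nextInt(20);
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            list.add(rng.nextInt(1000));
        }
        return list;
    }

    // Code under test: returns a reversed copy, leaving the input untouched.
    static List<Integer> reversed(List<Integer> input) {
        List<Integer> copy = new ArrayList<>(input);
        Collections.reverse(copy);
        return copy;
    }

    // Property: reversing twice gives back the original list.
    static boolean holdsFor(List<Integer> input) {
        return reversed(reversed(input)).equals(input);
    }

    // Runs the property against many generated inputs.
    // Returns the first counterexample found, or null if every trial passed.
    static List<Integer> check(int trials, long seed) {
        Random rng = new Random(seed); // fixed seed keeps failures reproducible
        for (int i = 0; i < trials; i++) {
            List<Integer> input = randomList(rng);
            if (!holdsFor(input)) {
                return input;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Integer> counterexample = check(1000, 42L);
        System.out.println(counterexample == null
                ? "property held for all trials"
                : "counterexample: " + counterexample);
    }
}
```

The test states what must always be true instead of listing individual input and output pairs, and the generator supplies the cases. That is where the saving over hand-written test suites comes from.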