Chapter 10
There's more...
STM is a very powerful tool in the hands of the "concurrent developer", but it is important to understand that this power doesn't come cheap. A simple increment of an Integer variable inside a transaction costs many more CPU cycles than the handful required to update a plain variable in Java. If we run javap -c over the bytecode produced by compiling the StmValueIncreaser class, we will see a very long list of instructions.
Removing the STM references shrinks the bytecode dramatically, which shows that the Multiverse STM implementation used by GPars is not lightweight. Nevertheless, it is a very effective paradigm for reasoning about and implementing heavily concurrent systems. In this recipe, we addressed a relatively simple problem that could also have been solved using the JDK's AtomicInteger. However, Multiverse supports many more data types and data structures: take a look at the documentation and Javadoc for a deeper understanding of what the framework has to offer.
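For comparison, the AtomicInteger alternative mentioned above can be sketched as follows. This is a minimal illustration that uses only the JDK; it is not the recipe's StmValueIncreaser class, and the thread count and iteration count are arbitrary values chosen for the example.

```groovy
import java.util.concurrent.atomic.AtomicInteger

// Shared counter; incrementAndGet() is atomic, so no locking or transaction is needed.
def counter = new AtomicInteger(0)

// Start 8 threads (arbitrary number) that each increment the counter 1000 times.
def threads = (1..8).collect {
    Thread.start {
        1000.times { counter.incrementAndGet() }
    }
}

// Wait for every thread to finish before reading the result.
threads*.join()

println "final value: ${counter.get()}"
```

No increment is lost, so the final value is 8 * 1000. The trade-off is that AtomicInteger protects a single value only, whereas a transaction can update several fields atomically.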
See also
- http://multiverse.codehaus.org
- http://gpars.codehaus.org/STM
- http://en.wikipedia.org/wiki/Software_transactional_memory
- http://gpars.org/0.12/javadoc/groovyx/gpars/stm/GParsStm.html
Using dataflow variables for lazy evaluation
Dataflow concurrency is a concurrent programming paradigm that has been around for three decades now. What is so exciting about it?
The main idea behind Dataflow concurrency is to reduce the number of variable assignments to one. A variable can only be assigned a value once in its lifetime, while the number of reads is unlimited. If a variable has not yet been written, all read operations block until the variable is actually bound. With this straightforward, single-assignment approach, it is impossible to read an inconsistent value or run into data races. The deterministic nature of Dataflow concurrency ensures that it will always behave the same: you can run the same operation 5 times or 10 million times, and the result will always be the same. Conversely, if an operation enters a deadlock the first time, it will do so every other time you run it. These qualities make it very easy to reason about concurrency, but they come at a price: the code must be deterministic. Random values, time-dependent logic, exceptions, and so on are not allowed. The section of code that employs Dataflow concurrency must act as a pure function, with input and output.
Groovy's GPars framework exposes this alternative concurrency model, and in this recipe we are going to explore how to solve the problem of high latency when invoking external systems, described in the Running tasks in parallel and asynchronously recipe.
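The single-assignment rule and the blocking reads can be seen in a few lines. The following is a minimal sketch, not part of the recipe: it assumes GPars can be resolved through @Grab (the 1.2.1 version is an assumption, as the recipe does not state one), and the variable names x, y, and sum are chosen only for illustration.

```groovy
// Assumption: GPars 1.2.1 from Maven Central; adjust to the version you use.
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowVariable
import static groovyx.gpars.dataflow.Dataflow.task

def x = new DataflowVariable()
def y = new DataflowVariable()
def sum = new DataflowVariable()

// This task starts first, but reading x.val and y.val blocks until both are bound.
task { sum << x.val + y.val }

// Each variable is bound exactly once, in whatever order the tasks happen to run.
task { x << 10 }
task { y << 5 }

// Reading sum.val blocks the main thread until the first task has bound it.
assert sum.val == 15

// Binding x again with a different value is rejected with an exception.
def rebindRejected = false
try {
    x << 99
} catch (Exception e) {
    rebindRejected = true
}
println "second bind rejected: ${rebindRejected}"
```

The order in which the three tasks are scheduled does not matter: the result is always 15, which is the determinism described above.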
www.it-ebooks.info
Concurrent Programming in Groovy
Getting ready
For setting up this recipe, please refer to the Getting Ready section of the Running tasks in parallel and asynchronously recipe.
Start the dummy web service using groovy app.groovy.
How to do it...
The following steps show how to modify the CriminalService class to leverage Dataflow concurrency.
1. Create a new Groovy class named CriminalServiceWithDataflow:
    package org.groovy.cookbook.dataflow

    import static groovyx.gpars.dataflow.Dataflow.task
    import groovyx.gpars.dataflow.DataflowVariable

    class CriminalServiceWithDataflow {

      def baseUrl

      CriminalServiceWithDataflow(String url) {
        baseUrl = url
      }

    }
2. Add a function to retrieve the JSON data for the specified country:
    def fetchData(String country) {
      println "fetching data for ${country}"
      def jsonResponse = new DataflowVariable()
      task {
        try {
          "${baseUrl}/${country}".toURL().openConnection().with {
            if (responseCode == 200) {
              jsonResponse << inputStream.text
            } else {
              jsonResponse << new RuntimeException(
                'Invalid Response Code from HTTP GET:' + responseCode
              )
            }
            disconnect()
          }
        } catch (e) {
          jsonResponse << e
        }
      }
      jsonResponse
    }
3. Add the main function from which data aggregation is done:
    List getData(List countries) {
      List aggregatedJson = []
      countries.each {
        aggregatedJson << fetchData(it)
      }
      aggregatedJson*.val
    }
4. To test our new class, let's add a simple test case:
    @Test
    void testDataflow() {
      def serviceUrl = 'http://localhost:5050'
      def criminalService =
        new CriminalServiceWithDataflow(serviceUrl)
      def data = criminalService.getData(['germany', 'us', 'canada'])
      assert 3 == data.size()
      data.each {
        try {
          println it
        } catch (e) {
          e.printStackTrace()
        }
      }
    }
How it works...
The fetchData function of the CriminalServiceWithDataflow class is where the power of Dataflow really shows. The function creates a DataflowVariable named jsonResponse and a task that is responsible for populating it. This variable can be written only once, through the << operator. The task contains the actual code to access the Criminal Service web service, with some simplistic exception handling. When the value of a DataflowVariable is read, the read blocks until the value is set (using <<). In this way, the time required to collect the data for the three countries is roughly equal to the longest response time rather than the sum of all three.
The getData function spreads the HTTP requests over three threads. Note that the fetchData method does not block. Blocking takes place only in the last line of the getData method, when val is read on each DataflowVariable containing the HTTP GET response.
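The timing behavior can be reproduced without the web service. The following sketch replaces the HTTP call with a sleep; the fakeFetch closure and the latency figures are hypothetical stand-ins for fetchData and for real response times, and the GPars version in @Grab is an assumption.

```groovy
// Assumption: GPars 1.2.1 from Maven Central; adjust to the version you use.
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.dataflow.DataflowVariable
import static groovyx.gpars.dataflow.Dataflow.task

// Hypothetical stand-in for fetchData: sleeps instead of calling the service.
def fakeFetch = { String country, long latencyMs ->
    def result = new DataflowVariable()
    task {
        Thread.sleep(latencyMs)                      // simulated network latency
        result << "data for ${country}".toString()   // single assignment
    }
    result                                           // returned immediately, not yet bound
}

// Made-up latencies in milliseconds.
def latencies = [germany: 300L, us: 500L, canada: 400L]

def start = System.currentTimeMillis()
// Starting the three tasks does not block...
def pending = latencies.collect { country, ms -> fakeFetch(country, ms) }
// ...only reading val does, once per variable.
def data = pending*.val
def elapsed = System.currentTimeMillis() - start

println "collected ${data.size()} responses in ${elapsed} ms"
```

Because the three tasks sleep concurrently, the elapsed time should be close to the slowest simulated latency (500 ms here) rather than the 1200 ms a sequential loop would need. The results come back in the order of the input list, not in the order the tasks finish.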
It's also worth noting how the exception handling is organized. Let's zoom into the code:
    def jsonResponse = new DataflowVariable()
    try {
      ...
    } catch (e) {
      jsonResponse << e
    }
When an exception occurs inside the task, we assign the Exception to the jsonResponse variable of type DataflowVariable. The DataflowVariable class has two methods to access the stored value:
- The val method that simply returns the Exception
- The get method that will rethrow the Exception, if any
Use val or get depending on your exception handling requirements. You can test how the exception handling works by shutting down the Ratpack server or by passing invalid countries that will yield a 404 response code.
There's more...
Dataflow concurrency is a very elegant paradigm, and there are more concepts in this model than the one covered in this recipe. The best way to learn them is to head to the official GPars Dataflow documentation at http://www.gpars.org/guide/guide/dataflow.html
See also
- http://en.wikipedia.org/wiki/Dataflow
- http://gpars.org/0.12/javadoc/groovyx/gpars/dataflow/DataflowVariable.html