CSV export of Ecto models

This post is about converting Ecto models to a csv file using this CSV library

Ecto is a database DSL for Elixir and is part of the Phoenix framework. Recently we had a requirement to export one of the tables as CSV (you can never escape such export requirement as long as there is excel right?)

Here is an ecto model we wanted to export

  defmodule Message do
    use MyApp.Web, :model

    schema "messages" do
      field :message, :string
      field :status, :string
      field :uuid, Ecto.UUID
      timestamps()
    end
  end

It is a pretty simple model backed by table called “messages” which has the columns message, status and uuid

Get the data to be exported from the database

...
messages = Message order_by(:inserted_at) |> Repo.all
...

And trying to convert it to CSV

messages
  |> CSV.encode
  |> Enum.to_list

does not work because the CSV encoder does not know how to deal with the Message type. So, define an encoder function for the Message type

defmodule Message do
  ...
  defimpl CSV.Encode, for: Message do
    def encode(cm, env \\ []) do
      [cm.message, cm.status, cm.uuid]
        |> Enum.map(fn(v) -> CSV.Encode.encode(v, env) end)
        |> Enum.join(",")
    end
  end  
end

We are creating an array of all the needed column values and encoding them individually. Then, joining them to form the CSV for the given message struct

But this is not enough. CSV.encode/2 expects a stream of data in a tabular format, and encodes it to RFC 4180 compliant CSV lines. By that it means the data should be in the format

[
  ["row1-col1-data","row1-col2-data","row1-col3-data"],
  ["row2-col1-data","row2-col2-data","row2-col3-data"]
]

So, wrap the array of Ecto Message objects into an array of array having a message object per row

[
  [message1],
  [message2]
]

Do that and pass it to the csv encoder

...
messages = Message order_by(:inserted_at) |> Repo.all
  |> Enum.map(fn(m) -> [m] end)
messages
  |> CSV.encode
  |> Enum.to_list
...

The resulting string will be the CSV of Ecto message instances

Rate Limiting the right way

While dealing with external services you would want to throttle requests. In Go, its pretty simple to implement it with something like this.

While this works for tens of operations a second, this does not scale for 1000s of operations a second. We have been working with an AMQP queue that needs to be drained with back pressure. Our downstream system is sending SMSs using a gateway and can only send a prescribed amount of SMSs but it is greater than tens of operations a second. There is a supplementary time package which provides rate limiting that works with token buckets.

The token bucket is an algorithm used in packet switched computer networks and telecommunications networks. It can be used to check that data transmissions, in the form of packets, conform to defined limits on bandwidth and burstiness (a measure of the unevenness or variations in the traffic flow). It can also be used as a scheduling algorithm to determine the timing of transmissions that will comply with the limits set for the bandwidth and burstiness. – Wikipedia

So you start off by creating a limit which defines the interval. For example, if you would like for 2000 events a minute.

You would then create a limiter that would setup the burst rate per second. So in our case you would want something like this.

Then you can reserve a token and sleep for the duration indicated by the reservation.

You can verify if things are going fine by printing the value every second.

You can also reserve multiple tokens at a time and do as many operations. This will ensure that the scheduling overhead on the goroutine is avoided.

Customizing create_react_app without ejecting

As you might be aware the create_react_app is all the new hotness these days. While it eliminates the need for complex boilerplate, it leaves little room to for customization. We are in a weird fix where we have to use our react app within a Rails view. The layout belongs to Rails but we want to host the react app within it. However, the create-react-app assumes (rightly so) that it would be used in an independent static project.

I would like to use that without running an npm run eject on the project. All we needed was to generate a manifest as a part of the build so that we can use that to change references to the CSS and JS tags in the layout. Fortunately there exists a webpack plugin called the webpack-manifest-plugin for doing precisely that. So we wrote a build script that uses that and placed it in script/build.js within the react repo.

process.env.NODE_ENV = 'production'

var webPackProd = require('react-scripts/config/webpack.config.prod');
var ManifestPlugin = require('webpack-manifest-plugin');

webPackProd.plugins = [new ManifestPlugin()].concat(webPackProd.plugins);

require('react-scripts/scripts/build');

We then changed package.json to run this script instead of the standard react-scripts build for the npm run build command.

{
  "name": "newreactapp",
  "version": "0.0.1",
  "private": true,
  "devDependencies": {
    "react-scripts": "0.6.1"
  },
  "dependencies": {
    "react": "^15.3.2",
    "react-dom": "^15.3.2",
    "webpack-manifest-plugin": "^1.0.1"
  },
  "scripts": {
    "start": "react-scripts start",
    "build": "node ./scripts/build.js",
    "test": "react-scripts test --env=jsdom",
    "eject": "react-scripts eject"
  }
}

Voila. Now when we have a manifest.json in our build directory that looks a bit like this.

{
  "main.css": "static/css/main.9a0fe4f1.css",
  "main.css.map": "static/css/main.9a0fe4f1.css.map",
  "main.js": "static/js/main.247a9e3c.js",
  "main.js.map": "static/js/main.247a9e3c.js.map",
  "static/media/logo.svg": "static/media/logo.5d5d9eef.svg"
}

Of course this is a hack and would not work very well if the structure of the template changes. I would really like more knobs and dials to configure create-react-app than the current all or nothing approach.

Integrating React Native into an existing Swift project

React Native is really nice. Its Cmd-R style development is refreshing compared to XCode’s build, deploy and run cycle. But if you want to use things like CoreData, NSOperationQueues or any of the other awesome libraries in the iOS world, you have to write bridges. In this blog post we will be focusing on adding React Native to an existing swift application. React Native’s secret is in its RCTRootView. So just like for the browser, React focuses on being a good view library and can be mixed with your existing swift application.

We’ll be using the XCode’s Single View Controller Swift project template for demonstration. React Native’s site has excellent documentation how you would go about such an integration. I add a few more things on top like loading from compiled / hosted JSBundle, caching the JSBundle and manually reloading React code after refreshing the cache. You can download the sample code from Github.

After you have your project and set it up with cocoapods, go ahead and initialize your package.json. My package.json looks like this. I have placed my package.json in the same location as my .xcodeproj and .xcworkspace.

Then go ahead and run npm install on the command line to install the necessary dependencies. This will create a node_modules folder. You can add this folder to your .gitignore file much like Pods directory for cocoapods.

You would then have to add React and its subspecs to your Podfile. My Podfile looks like this.

This adds references to the React Native pods and links them to your project as frameworks. Don’t forget to pod install after this. You can then add the entry point to the react native view. This file will be called index.ios.js by convention and we will be referring to it from our ViewController.

You can then start the react-native server by running npm start at the command line from the folder that has the file package.json. This should start a server that should be able to serve our react native bundle from localhost:8081.

I have used a simple View controller which has two views a RCTRootView that hosts the React Native app and a UIButton that will let me reload the code using the RCTBridge object of the RCTRootView. This is what my ViewController code looks like.

Running the app should present the app with React Native view loaded.

Bridging Swift views to React Native

Now this turned out to be trickier than I originally thought it would. React Native has a RCTViewManager class that expects you to return a View. Now the view can be configured with properties and you would have to export view using Obj-C macros. For the purposes of illustration, I am going to create a simple view that has a label and takes in property called message. It would then simply display the value of the message in a label concatenated with the word “Swifted”. Yeah… i made that nonsensical term up just now. I am going to create a view called CustomRNView.

Notice that I am exposing the swift class to the ObjC world using the @objc annotation. Now I have to write a RCTViewManager class that returns me this view.

This class CustomRNViewManager simply has a view method that returns a CustomRNView object. Also note that this class has also been annotated with the @objc annotation.

Now we have to expose it to the react world with the the RCT_EXTERN_MODULE and RCT_EXPORT_VIEW_PROPERTY macros. You will have to create the CustomRNView.h and CustomRNView.m files. The header declares our custom UIView as a RCTView and the implementation will export our CustomRNViewManager.

You do not have to put anything in the bridging header as we are loading rest of the React Native specs as frameworks.

Now, we have to get our CustomRNView to our javascript side. You can now use the requireNativeComponent method to refer to the exported module.

It should now display the CustomRNView within the react native view.

Caching the jsbundle

For the performance reasons, you might want to distribute the bundle with your app or better still load it the first time from a URL and then cache it to your documents directory. When you want to refresh then you can redownload the bundle and reload the code via RCTBridge. I have done precisely that in my ViewController.swift.

This will download the the file once and load it from the documents directory. You will also notice that the app launch time is significantly faster. You might want to consider using services like Microsoft CodePush to automatically download the latest code as you push changes to your code. We are very excited with this approach as we can be more agile with our existing swift apps by leveraging React Native and its code push strategies without having to go through the app compile, deploy and run cycles. Let us know your thoughts and opinions.

Refactoring Swift

We are building a mobile app for enterprise asset management. One of the methods in our SyncManager class had a very complex merge method that merged the items in core data and the items received from the server. To give you a perspective this is what the method looked like.

This method takes in sorted business objects (Work Orders) from core data as the first argument and sorted JSON array on work order id as the second argument and does swift’s equivalent of pointer increments to merge these lists together. The code has two vars and three while loops. While the code works per se, reasoning about a bug by say forgetting to increment a pointer is very hard.

The goal for our refactor is zero vars and a functional way to subtract the lists based on object keys. This is what the transformed functional code looks like.

By most measures, the refactored code looks a bit better than the original. We have removed a lot of complexity from various branching and looping and have made the interface consistent and readable.

Swift does not provide the subtract and indexUniqueOn methods. I wrote a bunch of functions for extending the SequenceType.

The subtract method takes in another collection and two lambdas that lets it arrive on the common key. This lets us subtract items of one sequence based on the one of the derived keys of the other sequence. The Swift’s type system lets us set constraints on generics that help us express this beautifully. Swift’s type system is actually quite nice. And this is what the tests for the extension look like.

Experience Report – Introducing an experienced Ruby Developer to Clojure

I recently introduced Steve to clojure and wanted a good example to show the power of clojure. Steve is a lead developer for our client and has has worked extensively with Ruby and values the expressiveness and the productivity of Ruby. He also has a working knowledge of Java and is interested in clojure but not very convinced about its utility and is generally wary of lisps. I wanted to build an example that demonstrates the syntax, the expressiveness and the concurrency features of clojure. I also wanted to highlight some aspects of the language like threading, function composition and so on which are either absent from Ruby or make it hard to express.

I have tried introducing clojure to other people and have favored the approach of short examples like the ones on Clojure docs. For example, I always used a (Thread/sleep 1000) to demonstrate a long running job for pmap. In fact, clojure docs on pmap uses a similar example. While this convinces the people on why such concurrent systems are important, they fail to see how it can be useful in a real world scenario.

So this time, instead of using several short academic examples, I decided to write a simple program that mimics a real world scenario that can leverage concurrency. The problem I chose to demonstrate was to make a API call to reddit to get top content on the clojure subreddit and do a count of articles by domain and extract the titles of the articles using html scraping using enlive. This problem is not too complex at the same time its not as trivial as (Thread/sleep 1000).

We started off by getting the reddit feed. I initially used slurp for its instant gratification value and showed that this was similar to Ruby’s open method in the open-uri library. Reddit performs rate limiting on API requests and determines it based on the User-Agent header. So I used (clj-http)[https://github.com/dakrone/clj-http] to fetch the JSON with a different user agent.

(require '[clj-http.client :as client])
(require '[clojure.data.json :as json])

(def ^:const REDDIT-URL "http://reddit.com/r/clojure.json?limit=100")
(def ^:const headers {:headers {"User-Agent" "showoffclojure.core by vagmi"}})

(def parse-json
  (comp #(json/read-str % :key-fn keyword)
        :body
        #(client/get % headers)))

(parse-json reddit-url)

I wanted to demonstrate the functional nature of clojure. This demonstrates how simple functions can be composed using comp to higher level functions and how they can be invoked. This also shows how keywords are behave as functions in clojure and are quite prevalent when dealing with maps. I also briefly talked about the IFn interface.

(let [reddit-data (-> (parse-json REDDIT-URL)
                      (:data)
                      (:children))]
  (map :data reddit-data))

To extract the data out of the returned json, I used the -> threading macro. Although, I mentioned macros, I did not spend too much time on them in the interest of brevity.

(require '[net.cgrand.enlive-html :as html])

(defn fetch-url [url]
  (-> url
      (client/get headers)
      ;; Java interop happens seamlessly
      (java.io.StringReader.)
      (html/html-resource)))

(defn extract-title [url]
  (try
    (first (map html/text
                (-> (fetch-url url)
                    (html/select [:html :head :title]))))
    ;; Exception handling is as easy as this
    (catch Exception e "unknown")
    (catch Error e "Unknown")))

We then wrote functions to extract the title of the page using enlive and I introduced how easy it was to do Java interop using Clojure.

;; this will have a structure like
;; {"infoq.com" 3 "self.Clojure" 4}
(defonce domain-summary (ref {}))

;; this will have the following keys
;; [{:article_url, :reddit_title, :actual_title}]
(defonce article-details (ref []))

(defn update-summary [entry]
  (let [title (extract-title (:url entry))
        domain-count (@domain-summary (:domain entry))]
    (dosync
      (if domain-count
        (alter domain-summary assoc (:domain entry) (inc domain-count))
        (alter domain-summary assoc (:domain entry) 1))
      (alter article-details conj {:url (:url entry)
                                   :reddit-title (:title entry)
                                   :actual-title title}))
    (print ".")
    (flush)))

I then talked a bit about how clojure handles values vs identities and its opinions on managing state in a program. We decided to have two data structures domain-summary and article-details. domain-summary will be a map that will have the domain as the key and the count of the entries from the api as the value and the article-details will be a vector of map having the url, reddit-title and actual-title of the page. For every entry in the reddit api list, we will have to update both the domain-summary and the article-details structures. This segued nicely into STM and how clojure supports STM through refs. We put pretty . to indicate the progress and also to point out that IO operations should stay out of dosync blocks.

(let [reddit-data (-> (parse-json REDDIT-URL)
                      (:data)
                      (:children))]
  (map (comp update-summary :data) reddit-data))

We then changed the previous map expression to extract the summary as well. Now the part that completely blew his mind was when I changed map to pmap and it magically utilized all the cores performing ~7x faster as I have 8 cores on my machine.

By the end of this session, Steve was quite convinced about using clojure for our production projects and picked up enough clojure over the weekend to port some of our core parts off ruby to clojure to evaluate it on a real word scenario. I also helped setup vim-fireplace on his machine.

I cleaned up the code a bit and checked it into github. So if you would like to show off clojure to your colleagues or friends, please feel free to use it. If you have any suggestions to make things simpler or demonstrate other concepts that would be helpful in selling clojure’s value proposition better, please send me a PR. I would like to keep the example simple (~ 50 – 60 lines).