Recently I was working on a sample project. The objective was to consume a list of user feedback for a given period of time, derive some information from it, and display it in a meaningful format. I fetched all the data in a selected range, then computed the average rating and the number of users who gave that rating over a periodic interval, and showed this information in a line chart using the iOS Charts library.
This solution seemed simple and meaningful, but I started noticing a few code smells right away. Applications with heavy data-analysis tasks tend to have lots of queries that derive information from a set of data. If these queries are not well structured and composable, we can run into a lot of problems later, including hard-to-find bugs.
So I looked at whether any other language has a better way to organize and write these complex data queries, and I stumbled upon the LINQ approach in C#.
LINQ
LINQ is an acronym for Language Integrated Query, which is descriptive for where it’s used and what it does. With the help of LINQ, one can write descriptive, composable and lazy queries.
So my objective was to write queries in Swift with the following properties:-
- Descriptive
- Composable
- Lazy
- Data Agnostic
Let’s understand how to achieve this with a simple example.
Problem Statement
Given a list of product feedback, group the data per day and per product type within a given period of time.
Input Data:-
{
  "ProductList": [
    {
      "name": "Product 1",
      "rating": 3,
      "createdDate": 1557053048
    },
    {
      "name": "Product 2",
      "rating": 1,
      "createdDate": 1557053058
    },
    {
      "name": "Product 1",
      "rating": 4,
      "createdDate": 1557053048
    },
    {
      "name": "Product 2",
      "rating": 3,
      "createdDate": 1557053058
    }
  ]
}
Output Data:-
After processing the above sample data, the output should look like this:-
{
  "1557053048": {
    "Product 1": [
      {
        "name": "Product 1",
        "rating": 3,
        "createdDate": 1557053048
      },
      {
        "name": "Product 1",
        "rating": 4,
        "createdDate": 1557053048
      }
    ]
  },
  "1557053058": {
    "Product 2": [
      {
        "name": "Product 2",
        "rating": 1,
        "createdDate": 1557053058
      },
      {
        "name": "Product 2",
        "rating": 3,
        "createdDate": 1557053058
      }
    ]
  }
}
One possible solution
Let’s first check out the easy solution. Here is the code for that:-
Model:-
public struct Products {
    // MARK: Properties
    public var items: [Product]
}

public struct Product {
    // MARK: Properties
    public var name: String
    public var createdDate: Int64
    public var rating: Int
}
Common GroupBy Util:-
public extension Sequence {
    // Groups the elements by the value found at `keyPath`, keeping their original order within each group.
    func groupBy<T: Hashable>(_ keyPath: KeyPath<Element, T>) -> [T: [Iterator.Element]] {
        var results = [T: Array<Iterator.Element>]()
        forEach {
            let key = $0[keyPath: keyPath]
            if var array = results[key] {
                array.append($0)
                results[key] = array
            }
            else {
                results[key] = [$0]
            }
        }
        return results
    }
}
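To get a feel for the helper, here is a small sketch that groups the sample feedback by product name. It repeats the model and a shortened variant of the helper (using the dictionary `default:` subscript, which is my own shortcut here) so that it can run by itself.

```swift
// Stand-in for the article's model, repeated so the snippet is self-contained.
struct Product {
    var name: String
    var createdDate: Int64
    var rating: Int
}

extension Sequence {
    // Shortened variant of groupBy: same grouping, written with the `default:` subscript.
    func groupBy<T: Hashable>(_ keyPath: KeyPath<Element, T>) -> [T: [Element]] {
        var results = [T: [Element]]()
        for element in self {
            results[element[keyPath: keyPath], default: []].append(element)
        }
        return results
    }
}

let sample = [
    Product(name: "Product 1", createdDate: 1557053048, rating: 3),
    Product(name: "Product 2", createdDate: 1557053058, rating: 1),
    Product(name: "Product 1", createdDate: 1557053048, rating: 4),
    Product(name: "Product 2", createdDate: 1557053058, rating: 3),
]

let byName = sample.groupBy(\Product.name)
let productOneRatings = (byName["Product 1"] ?? []).map { $0.rating }
print(productOneRatings) // [3, 4]
```

Elements keep their original order inside each group, while the order of the dictionary keys themselves is not guaranteed.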
Usecase for Query:-
public struct ProductQueryUsecase {
    func getProductGroupByCreatedDateAndProduct(products: Products, between: (startDate: Int64, endDate: Int64)) -> [Int64: [String: [Product]]] {
        // filter list of products between these dates
        return products.items.filter { (item) -> Bool in
            return (between.startDate <= item.createdDate && item.createdDate < between.endDate)
        }.groupBy(\Product.createdDate).mapValues { $0.groupBy(\Product.name) }
    }
}
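A quick way to sanity-check this one-off query is to run it over the sample input. The sketch below repeats the model, the helper and the use case in compact form so that it runs by itself. The date range values are assumptions, chosen only so that both sample timestamps fall inside the range.

```swift
struct Product {
    var name: String
    var createdDate: Int64
    var rating: Int
}

struct Products {
    var items: [Product]
}

extension Sequence {
    func groupBy<T: Hashable>(_ keyPath: KeyPath<Element, T>) -> [T: [Element]] {
        var results = [T: [Element]]()
        for element in self {
            results[element[keyPath: keyPath], default: []].append(element)
        }
        return results
    }
}

struct ProductQueryUsecase {
    func getProductGroupByCreatedDateAndProduct(
        products: Products,
        between: (startDate: Int64, endDate: Int64)
    ) -> [Int64: [String: [Product]]] {
        // Filter, group by date, then group each day's items by product name.
        return products.items
            .filter { between.startDate <= $0.createdDate && $0.createdDate < between.endDate }
            .groupBy(\Product.createdDate)
            .mapValues { $0.groupBy(\Product.name) }
    }
}

let products = Products(items: [
    Product(name: "Product 1", createdDate: 1557053048, rating: 3),
    Product(name: "Product 2", createdDate: 1557053058, rating: 1),
    Product(name: "Product 1", createdDate: 1557053048, rating: 4),
    Product(name: "Product 2", createdDate: 1557053058, rating: 3),
])

// Assumed range that covers both timestamps in the sample data.
let grouped = ProductQueryUsecase().getProductGroupByCreatedDateAndProduct(
    products: products,
    between: (startDate: 1557053000, endDate: 1557053100)
)
let firstDayRatings = (grouped[1557053048]?["Product 1"] ?? []).map { $0.rating }
print(firstDayRatings) // [3, 4]
```

Note that the whole pipeline is fixed inside one function: to reuse just the date filter or just one of the groupings elsewhere, we would have to copy it out.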
The above solution looks fine. The getProductGroupByCreatedDateAndProduct function works, and it can be unit tested. Then what is the problem?
The problem is that it does not scale. In an application with data analysis we often need to create custom queries. Building these queries from smaller units should be simple, composable and descriptive. So now we will try to write a solution out of smaller components that can be composed, are self-descriptive and are lazy.
QueryComponent Explained
So let’s first ask what the smallest component of a query looks like.
The smallest unit of a query is a QueryComponent, which has 2 inputs:-
- Input Data: the data over which the query should run.
- Query: the strategy for the query, e.g. in a groupBy query it is the keyPath over which the Input Data should be grouped.
Running a QueryComponent with a set of Input Data produces the final Output Info. Now how can we compose these QueryComponents?
To build a composable system, the rule of thumb is that the output of the first function should be of the same type as the input of the second function; this way we can compose the 2 functions.
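As a minimal sketch of that rule, the hypothetical `compose` helper below (not part of the article's code) only type-checks when the output type of the first function matches the input type of the second.

```swift
// Hypothetical helper: glue f and g together when f's output type equals g's input type.
func compose<A, B, C>(_ f: @escaping (A) -> B, _ g: @escaping (B) -> C) -> (A) -> C {
    return { a in g(f(a)) }
}

let keepPositive: ([Int]) -> [Int] = { numbers in numbers.filter { $0 > 0 } }
let total: ([Int]) -> Int = { numbers in numbers.reduce(0, +) }

// [Int] -> [Int] followed by [Int] -> Int gives [Int] -> Int.
let sumOfPositives = compose(keepPositive, total)
print(sumOfPositives([3, -1, 4])) // 7
```

Swapping the arguments, `compose(total, keepPositive)`, would not compile, because `total` returns an `Int` while `keepPositive` expects an `[Int]`.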
So to achieve the same function signature, we need a QueryComponent with 1 Input Data and 1 Output Info. This means we provide the query input to the QueryComponent first and then combine it with the next QueryComponent, pushing the Input Data and the corresponding Output Info to the outermost layer.
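One way to picture "query first, data later" is a function that takes the query and returns another function that still waits for the data. The `grouped(by:)` name below is hypothetical and only illustrates the shape; it leans on the standard library's `Dictionary(grouping:by:)`.

```swift
// Hypothetical curried component: the query (a key path) is supplied first,
// and the returned function still waits for the Input Data.
func grouped<T, K: Hashable>(by keyPath: KeyPath<T, K>) -> ([T]) -> [K: [T]] {
    return { items in Dictionary(grouping: items, by: { $0[keyPath: keyPath] }) }
}

let words = ["apple", "avocado", "banana"]

let byLength = grouped(by: \String.count) // query supplied, nothing has run yet
let result = byLength(words)              // Input Data supplied, grouping happens here

let fiveLetterWords = result[5] ?? []
print(fiveLetterWords) // ["apple"]
```

Once the query is baked in, every component has the same one-input, one-output shape, which is what makes chaining them possible.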
Implementing Final Solution
To achieve this we can use the Reader Monad. To know more about the Reader Monad, please check out my previous blog post on the topic.
I chose the Reader Monad for 2 main reasons:-
- It provides a type which can be composed.
- It captures the functionality and specifies the dependency required to run that functionality in the type signature itself.
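To make the second point concrete, here is a cut-down sketch of a Reader with only `run` and `map` (the fuller implementation used in this post comes next). The `count` and `describe` values are made up for illustration.

```swift
// Cut-down Reader: a wrapper around a function from an environment E to a value A.
struct Reader<E, A> {
    let f: (E) -> A
    func run(_ e: E) -> A { return f(e) }
    func map<B>(_ g: @escaping (A) -> B) -> Reader<E, B> {
        return Reader<E, B> { e in g(self.run(e)) }
    }
}

// The type reads as: "give me an [Int] and I will produce an Int".
let count = Reader<[Int], Int> { ratings in ratings.count }

// Mapping changes the output type but keeps the declared dependency on [Int].
let describe = count.map { "\($0) ratings" }

// Nothing is computed until the dependency is finally supplied.
let summary = describe.run([3, 1, 4])
print(summary) // 3 ratings
```

The dependency (`[Int]`) stays visible in the type of `describe` all the way until `run` is called, which is the property I wanted for the query components.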
We have used the following implementation of the Reader Monad:-
public struct Reader<E, A> {
    // The wrapped function from the environment E to the result A
    let f: (E) -> A

    // Wraps a constant value, ignoring the environment
    static func unit(_ a: A) -> Reader<E, A> {
        return Reader { _ in a }
    }

    // Supplies the environment and runs the wrapped function
    func run(_ e: E) -> A {
        return f(e)
    }

    func map<B>(_ g: @escaping (A) -> B) -> Reader<E, B> {
        return Reader<E, B> { e in g(self.run(e)) }
    }

    func flatMap<B>(_ g: @escaping (A) -> Reader<E, B>) -> Reader<E, B> {
        return Reader<E, B> { e in g(self.run(e)).run(e) }
    }
}
precedencegroup LeftApplyPrecedence {
    associativity: left
    higherThan: AssignmentPrecedence
    lowerThan: TernaryPrecedence
}

infix operator >>= : LeftApplyPrecedence
infix operator >>>= : LeftApplyPrecedence
infix operator >>=> : LeftApplyPrecedence

// flatMap: chain a step that itself returns a Reader
func >>= <E, A, B>(a: Reader<E, A>, f: @escaping (A) -> Reader<E, B>) -> Reader<E, B> {
    return a.flatMap(f)
}

// map: transform the output of a Reader
func >>>= <E, A, B>(a: Reader<E, A>, f: @escaping (A) -> B) -> Reader<E, B> {
    return a.map(f)
}

// map over the values of a dictionary output, producing a nested dictionary
func >>=> <E, A, B, C>(a: Reader<E, [A: B]>, f: @escaping (B) -> [C: B]) -> Reader<E, [A: [C: B]]> {
    return a.map { $0.mapValues(f) }
}

/// Pipe forward | Applies an argument on the left to a function on the right.
infix operator |> : LeftApplyPrecedence
public func |> <A, B> (a: A, f: (A) -> B) -> B {
    return f(a)
}
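Here is a small sketch of how two of these operators read in practice. It repeats a trimmed Reader plus the `|>` and `>>>=` operators so that it runs by itself; `atLeast` and `countItems` are hypothetical components invented for this example.

```swift
struct Reader<E, A> {
    let f: (E) -> A
    func run(_ e: E) -> A { return f(e) }
    func map<B>(_ g: @escaping (A) -> B) -> Reader<E, B> {
        return Reader<E, B> { e in g(self.run(e)) }
    }
}

precedencegroup LeftApplyPrecedence {
    associativity: left
    higherThan: AssignmentPrecedence
    lowerThan: TernaryPrecedence
}
infix operator >>>= : LeftApplyPrecedence
infix operator |> : LeftApplyPrecedence

func >>>= <E, A, B>(a: Reader<E, A>, f: @escaping (A) -> B) -> Reader<E, B> {
    return a.map(f)
}
func |> <A, B>(a: A, f: (A) -> B) -> B {
    return f(a)
}

// Hypothetical component: keep ratings at or above a threshold.
func atLeast(_ minimum: Int) -> Reader<[Int], [Int]> {
    return Reader<[Int], [Int]> { ratings in ratings.filter { $0 >= minimum } }
}

// Hypothetical plain function used as the next step.
func countItems(_ ratings: [Int]) -> Int {
    return ratings.count
}

// `3 |> atLeast` supplies the query input; `>>>=` maps over the eventual output.
let goodRatingCount = 3 |> atLeast >>>= countItems

let numberOfGoodRatings = goodRatingCount.run([3, 1, 4, 2])
print(numberOfGoodRatings) // 2
```

Because both operators share the same left-associative precedence group, the chain is evaluated left to right without extra parentheses.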
Now let us see the generic implementation of Sort, Filter and GroupBy functions with Reader Monad:-
public extension Sequence {
    func groupBy<T: Hashable>(_ keyPath: KeyPath<Element, T>) -> [T: [Iterator.Element]] {
        var results = [T: Array<Iterator.Element>]()
        forEach {
            let key = $0[keyPath: keyPath]
            if var array = results[key] {
                array.append($0)
                results[key] = array
            }
            else {
                results[key] = [$0]
            }
        }
        return results
    }
}

// Each function below takes only the query input and returns a Reader that still waits for the data.
public func sort<T>(by sortFn: @escaping (T, T) -> Bool) -> Reader<[T], [T]> {
    return Reader { value in
        return value.sorted(by: sortFn)
    }
}

public func filter<T>(isIncluded predicate: @escaping (T) -> Bool) -> Reader<[T], [T]> {
    return Reader { value in
        return value.filter(predicate)
    }
}

public func groupBy<T, R: Hashable>(_ keyPath: KeyPath<T, R>) -> Reader<[T], [R: [T]]> {
    return Reader { value in
        return value.groupBy(keyPath)
    }
}
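As a rough sketch of how these components behave, the snippet below chains all three over a list of words. It repeats a trimmed Reader and the three functions so that it runs by itself, and it swaps the custom Sequence extension for the standard library's `Dictionary(grouping:by:)` to stay short. The word list is made up.

```swift
struct Reader<E, A> {
    let f: (E) -> A
    func run(_ e: E) -> A { return f(e) }
    func map<B>(_ g: @escaping (A) -> B) -> Reader<E, B> {
        return Reader<E, B> { e in g(self.run(e)) }
    }
}

func sort<T>(by sortFn: @escaping (T, T) -> Bool) -> Reader<[T], [T]> {
    return Reader<[T], [T]> { items in items.sorted(by: sortFn) }
}

func filter<T>(isIncluded predicate: @escaping (T) -> Bool) -> Reader<[T], [T]> {
    return Reader<[T], [T]> { items in items.filter(predicate) }
}

func groupBy<T, R: Hashable>(_ keyPath: KeyPath<T, R>) -> Reader<[T], [R: [T]]> {
    return Reader<[T], [R: [T]]> { items in
        Dictionary(grouping: items, by: { $0[keyPath: keyPath] })
    }
}

// Each component is built from its query input only; no data is involved yet.
let shortWords: Reader<[String], [String]> = filter(isIncluded: { $0.count <= 4 })
let alphabetical: Reader<[String], [String]> = sort(by: <)
let byLength: Reader<[String], [Int: [String]]> = groupBy(\String.count)

// Chain them: filter, then sort, then group.
let query = shortWords.map(alphabetical.run).map(byLength.run)

// The data shows up only at the very end.
let result = query.run(["pear", "fig", "plum", "kiwi", "apple"])
let fourLetterWords = result[4] ?? []
let threeLetterWords = result[3] ?? []
print(fourLetterWords)  // ["kiwi", "pear", "plum"]
print(threeLetterWords) // ["fig"]
```

The same three components could be recombined in a different order, or with different predicates, without touching their implementations.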
The above implementations of Sort, Filter and GroupBy use the Reader Monad to encapsulate the generic functionality and push the Input Data dependency to the outside.
Let’s now use these new generic QueryComponents to build our solution.
Final Query Usecase
public struct ProductQueryUsecase {
    public static let filterByDate = { (dateRange: (startDate: Int64, endDate: Int64)) -> Reader<[Product], [Product]> in
        return filter(isIncluded: { item -> Bool in
            return (dateRange.startDate <= item.createdDate && item.createdDate < dateRange.endDate)
        })
    }

    public static let groupByProductType = { () -> Reader<[Product], [String : [Product]]> in
        return groupBy(\Product.name)
    }

    public static let groupByCreatedDate = { () -> Reader<[Product], [Int64 : [Product]]> in
        return groupBy(\Product.createdDate)
    }

    // Now we can compose query components to achieve bigger functionality
    func getProductGroupByCreatedDateAndProduct(between: (startDate: Int64, endDate: Int64)) -> Reader<[Product], [Int64 : [String : [Product]]]> {
        return ((between
            |> ProductQueryUsecase.filterByDate)
            >>>= ProductQueryUsecase.groupByCreatedDate().run
            >>=> ProductQueryUsecase.groupByProductType().run)
    }
}
As we can see, getProductGroupByCreatedDateAndProduct is implemented by composing smaller functions. This implementation has all the properties we wished to accomplish: it is descriptive, composable and lazy.
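To see the composed query run end to end, here is a condensed, self-contained sketch. It trims the Reader to `run` and `map`, uses `Dictionary(grouping:by:)` in place of the Sequence extension, and writes the components as free values; the `DateRange` alias, the `productsByDateAndName` name and the date range values are my own assumptions for this example.

```swift
struct Product {
    var name: String
    var createdDate: Int64
    var rating: Int
}

struct Reader<E, A> {
    let f: (E) -> A
    func run(_ e: E) -> A { return f(e) }
    func map<B>(_ g: @escaping (A) -> B) -> Reader<E, B> {
        return Reader<E, B> { e in g(self.run(e)) }
    }
}

precedencegroup LeftApplyPrecedence {
    associativity: left
    higherThan: AssignmentPrecedence
    lowerThan: TernaryPrecedence
}
infix operator >>>= : LeftApplyPrecedence
infix operator >>=> : LeftApplyPrecedence
infix operator |> : LeftApplyPrecedence

func >>>= <E, A, B>(a: Reader<E, A>, f: @escaping (A) -> B) -> Reader<E, B> {
    return a.map(f)
}
func >>=> <E, A: Hashable, B, C: Hashable>(a: Reader<E, [A: B]>, f: @escaping (B) -> [C: B]) -> Reader<E, [A: [C: B]]> {
    return a.map { $0.mapValues(f) }
}
func |> <A, B>(a: A, f: (A) -> B) -> B { return f(a) }

func filter<T>(isIncluded predicate: @escaping (T) -> Bool) -> Reader<[T], [T]> {
    return Reader<[T], [T]> { items in items.filter(predicate) }
}
func groupBy<T, R: Hashable>(_ keyPath: KeyPath<T, R>) -> Reader<[T], [R: [T]]> {
    return Reader<[T], [R: [T]]> { items in
        Dictionary(grouping: items, by: { $0[keyPath: keyPath] })
    }
}

// Hypothetical alias to keep the signatures short.
typealias DateRange = (startDate: Int64, endDate: Int64)

let filterByDate = { (range: DateRange) -> Reader<[Product], [Product]> in
    return filter(isIncluded: { range.startDate <= $0.createdDate && $0.createdDate < range.endDate })
}

func productsByDateAndName(between range: DateRange) -> Reader<[Product], [Int64: [String: [Product]]]> {
    return range
        |> filterByDate
        >>>= groupBy(\Product.createdDate).run
        >>=> groupBy(\Product.name).run
}

let sample = [
    Product(name: "Product 1", createdDate: 1557053048, rating: 3),
    Product(name: "Product 2", createdDate: 1557053058, rating: 1),
    Product(name: "Product 1", createdDate: 1557053048, rating: 4),
    Product(name: "Product 2", createdDate: 1557053058, rating: 3),
]

// Building the query touches no data; the work happens only inside `run`.
let query = productsByDateAndName(between: (startDate: 1557053000, endDate: 1557053100))
let result = query.run(sample)

let firstDayRatings = (result[1557053048]?["Product 1"] ?? []).map { $0.rating }
print(firstDayRatings) // [3, 4]
```

The same `query` value can be run against any other `[Product]` list, which is where the laziness and the data-agnostic shape pay off.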
With this implementation we can scale to more complex queries, each built by combining smaller query components.