Sunday, 30 October 2011

LINQ in C++


What is LINQ?


LINQ stands for Language Integrated Query - a .NET component which comes with .NET 3.5 and above. It provides facilities for querying data, in a very elegant and programmer-friendly manner. For example, if we have a list of persons, and we want to filter all teenagers and then sort them by their name (alphabetically), then without LINQ we would do this task in the following way probably:
var results = new List<Person>();
foreach(Person p in persons)
{
    if ( p.Age >= 13 && p.Age <= 19 ) //by definition of teenager!
        results.Add(p);
}
results.Sort(new NameComparer());
where NameComparer is a class defined as:
public class NameComparer  : IComparer<Person>
{
    public int Compare(Person x, Person y)
    {
        return String.Compare(x.Name, y.Name);
    }
 }
That is too much of coding which is one of the drawbacks of imperative programming style; the programmer has to tell how the task should be done, rather than what task should be done. Well, I admit that an experienced .NET programmer might make it a bit shorter. Lets give it try: instead of a full-fledged class NameComparer, if we could use lambda to compare the items, then the entire task would reduce to this:
var results = new List<Person>();
foreach(Person p in persons)
{
    if ( p.Age >= 13 && p.Age <= 19 )
        results.Add(p);
}
results.Sort((x,y) => String.Compare(x.Name, y.Name));
That is a lot of improvement, but still not as elegant as the one which uses LINQ. Or can it be even shortened? I've no idea. See how LINQ version looks like:
 var results = persons.Where(p=> p.Age >= 13 && p.Age <= 19).OrderBy(p=> p.Age); 
Cool, isn't it?

Working towards C++ syntax


Now, the question is, can we write this query in C++ in a similar way? First and foremost, although the new Standard C++11 have introduced lambda into the language, it doesn't have any feature like Extension Methods without which LINQ in C# cannot be even imagined. Without extension methods, the above query would look like this:
var results = Enumerable.OrderBy(Enumerable.Where(persons, p=> p.Age >= 13 && p.Age <= 19), p=> p.Age); 
which looks pretty much ugly. Extension methods and lambda expressions are heart of LINQ. So as noted earlier, while C++11 have lamda, it doesn't have extension method, then how are we going to implement LINQ in C++? If we implement where, orderby, select as free functions, and we don't do any workaround for extension methods, then in C++, we can write query of this form only:
auto tmp1 = where(entities, lambda1); //lambda1 is some lambda expression
auto tmp2 = select(tmp1, lambda2); //lambda2 is some lambda expression
auto results = orderby(tmp2, lambda3); //lambda3 is some lambda expression
Not so elegant. In fact it is a shame, even if we make it short as,
auto results = orderby(select(where(entities, lambda1), lambda2), lambda3);
Little bit better, but still not elegant. We need to look for a way so that we can write the query as something which should look like this, or close to it:
auto results = entities.where(lambda1).select(lambda2).orderby(lambda3);
Those who have worked in Linux are probably familiar with pipe. Pipe allows one to combine multiple commands, in such a way that output of one command feeds as input to the next command, and so on. For example,
[snawaz@localhost]$ cat file.txt | wc 
      2     13     102
In this example, pipe combines two commands cat and wc, separated by | which is called pipe-character. The command cat displays the content of the input file, and wc prints the number of line, word and character in the input. Since cat has been combined with wc, the output of  cat feeds as input to wc, and therefore wc prints 2 13 102 as the number of line, word and character respectively, on the console which is final output of the pipeline. If we write this,
[snawaz@localhost]$ cat file.txt | wc | mail -s "Lines, Words, Character" snawaz@example.com
Now, cat outputs the content of file.txt which feeds as input to wc which outputs the count of line, words, and characters, and which in turn feeds as input to mail command, which simply sends a mail to snawaz@example.com, the subject of mail is Lines, Words, Character and the content of the mail is 2 13 102.

Coming back to LINQ, if we see carefully, there is a similarity between a typical linq query and linux pipelining; in both, the output of one command/query feeds as input to the next. That means, if we overload the operator | in C++, then we could write,
auto results = entities | where(lambda1) | select(lambda2) | orderby(lambda3);
That looks quite interesting syntax. Now all we need to implement where, select and orderby in such a way that we could make it work without changing the syntax at all.

Implementation and Examples


So here you go. I've implemented linq in C++, the following code snippet shows some example of its usage.
int main() 
{
     using namespace linq;
 
     //create a vector of persons and print them
     std::vector<person> persons = { 
         {1, "Aleeza"}, {3, "Nayab"}, {13, "Ashraf"}, {16, "Adil"}, 
         {20, "Ahsan"}, {21, "Zartab"}, {17, "Nadim"}, {19, "Nafis"}, 
         {13, "Neha"}, {15, "Zeeshan" }, {11, "Farah"}, {13, "Fauzia"},
         {23, "Nilufer"}, {24, "Talat"}, {7, "Tufail"}, {8, "Sana"} 
     };
     std::cout << "\tOriginal\n" << persons << std::endl;

     //orderby age and print the result 
     auto v1 = persons | orderby([](const person & p) { return p.age; });
     std::cout << "\tOrder by Age\n" << v1 << std::endl;

     //filter all teenagers then orderby name, and print the result
     auto v2 = persons 
               | where([](const person & p) { return p.age >=13 && p.age <= 19 ; }) 
               | orderby([](const person & p) { return p.age; });
     std::cout << "\tFilter Teenagers, and Order by Age\n" << v2 << std::endl;

     //filter all teenagers then orderby name, and print the result
     auto v3 = persons 
               | where([](const person & p) { return p.age >=13 && p.age <= 19 ; }) 
               | orderby([](const person & p) { return p.name; });
     std::cout << "\tFilter Teenagers, and Order by Name\n" << v3 << std::endl;

     //groupby kids, teenagers, adults and print the result
     auto groups = persons | groupby([](const person & p) 
                   {  
                       if ( p.age < 13 ) return "Kids";
                       else if ( p.age < 20 ) return "Teenagers";
                       else return "Adults";
                   });
     std::cout << "\tGroupby Maturity" << std::endl;
     //Note that the type of groups is std::map!
     for(auto i = groups.begin(); i != groups.end(); i++ )
     {
           std::cout << i->first << std::endl;
           std::cout << i->second << std::endl;
     }
}
Operator `<<` has been overloaded for `person` and `vector`, so the output of above code looks like this:
         Original
(1,Aleeza)
(3,Nayab)
(13,Ashraf)
(16,Adil)
(20,Ahsan)
(21,Zartab)
(17,Nadim)
(19,Nafis)
(13,Neha)
(15,Zeeshan)
(11,Farah)
(13,Fauzia)
(23,Nilufer)
(24,Talat)
(7,Tufail)
(8,Sana)

        Order by Age
(1,Aleeza)
(3,Nayab)
(7,Tufail)
(8,Sana)
(11,Farah)
(13,Ashraf)
(13,Neha)
(13,Fauzia)
(15,Zeeshan)
(16,Adil)
(17,Nadim)
(19,Nafis)
(20,Ahsan)
(21,Zartab)
(23,Nilufer)
(24,Talat)

         Filter Teenagers, and Order by Age
(13,Ashraf)
(13,Neha)
(13,Fauzia)
(15,Zeeshan)
(16,Adil)
(17,Nadim)
(19,Nafis)

         Filter Teenagers, and Order by Name
(16,Adil)
(13,Ashraf)
(13,Fauzia)
(17,Nadim)
(19,Nafis)
(13,Neha)
(15,Zeeshan)

          Groupby Maturity
Kids
(1,Aleeza)
(3,Nayab)
(11,Farah)
(7,Tufail)
(8,Sana)

Teenagers
(13,Ashraf)
(16,Adil)
(17,Nadim)
(19,Nafis)
(13,Neha)
(15,Zeeshan)
(13,Fauzia)

Adults
(20,Ahsan)
(21,Zartab)
(23,Nilufer)
(24,Talat)
The complete implementation of linq in C++, along with the test code, is uploaded at github. Have a look at linqcpp. Enjoy!

Sunday, 21 August 2011

Playing around with Boost.Spirit - Parsing integer triplets

I'm exploring Boost.Spirit and while doing so, I'm trying to parse integer triplets for input of the following form:
(8,7,15)
(0,0,1) (0,3,2) (0,6,3)
(1,0,4) (1,1,5)

Thursday, 9 June 2011

Why copy-semantics of stream classes from Standard Library are disabled?

C++ Standard IO Library defines several  stream classes, such as std::istream, std::ostream, std::iostream, std::ifstream, std::ofstream, std::fstream, std::istringstream, std::ostringstream, std::stringstream, and their wide-character counterparts.  Copy-semantic of all these streams has been disabled by having made the copy constructor and copy-assignment private, which means one cannot make a copy of stream object:

Tuesday, 7 June 2011

Religion of C

Taken from here
Oh Ken, thou who art the Root of Roots, hallowed be thy name. May you this day grant me my daily malloc. May my functions never return errors, and my error handlers all be perfect. Let us walk in the light, the light of C. Amen.

In the Truth of C, there are no segmentation faults. Only those programs who stray from the Path of C may enter the land of darkness, where Undefined Behaviour dwells and Demons do fly out of Noses.

Yeah, though I walk through the cluster of exceptions, I shall not prematurely optimize, and will live forever in blissful ignorance of the tail recursion optimization I do not need.

The Path is Narrow and Winding, and lo, it is Hard to Follow, for you will travel at great speed. Many are those who have strayed, and few are those who have kept to the Path. Hallowed be their names, for they will live in the Light of C forever!

Saturday, 16 April 2011

Create string on the fly just in one line

We often make use of std::stringstream to convert values of various types into a single std::string. For example, see this:

std::stringstream ss;
ss << 25 << " is greater than " << 5;
std::string s = ss.str(); //s => "25 is greater than 5"

Paul (at stackoverflow) was trying to think of a clever way to concatenate various things into a single string without having to use an ostringstream explicitly. He came up with a macro which he didn't like though.

Sunday, 10 April 2011

Calculating size of arrays - one dimensional and multiple dimensional

Someone asked at stackoverflow : How to get size of an array using metaprogramming?

My solution:

For one dimensional arrays,

template<typename T, size_t N>
size_t size_of_array(T (&)[N])
{
   return N;
}

Elegant ways to tokenize strings


Last night I posted a solution at stackoverflow. The question was : what is the right way to split a string into a vector of strings. Delimiter is space or comma.

This is what I first came up with, for space separated string: