





Gwibber is a popular Linux desktop app that you can use for Twitter, Facebook, Friendfeed, and a bunch of other social networking sites. Here are brief instructions on how to run it with pygi (the Python GObject Introspection library) instead of pywebkitgtk.
That’s all there is to it.
From a user’s perspective, it doesn’t matter whether you’re using pywebkitgtk or pygi, as there are no user-visible changes either way. But from a developer’s point of view, using pygi lets you prototype against the latest copy of WebKitGtk and take advantage of APIs that rely on third-party libraries; in WebKitGtk’s case, APIs that use libsoup. Wrapping these by hand in pywebkitgtk would take a considerable amount of time and would probably be more error-prone.
So if you’re a python-gtk developer looking to embed WebKitGtk in your new application, I would strongly suggest using pygi instead of pywebkitgtk.
n.b. I’m the maintainer of pywebkitgtk and will probably deprecate it once pygi is merged in pygobject, the official python GObject bindings.
Posted in software Tagged: pygi, python, pywebkitgtk


As I get older and *ahem* wiser, I feel the need to constantly change and assess myself and my improvements on a daily or weekly basis. There are an awful lot of things I would like to change personally, but for now I would like to share a few things that I would like to focus on for 2010 in terms of my open source effort.
Last year
During the past year I’ve made significant contributions to a few open source software applications, in particular the WebKitGtk port of the open-source WebKit project, as well as maintaining the pywebkitgtk project. But I suddenly lost interest due to a lot of factors. One of them was that even though I’ve contributed a significant amount of time and resources to those projects, I’ve realised I hardly learned anything new or significant. This is my fault: I focused too much on things I knew already or things that were easy, and did not put enough effort into learning and diving into the unknown. I’ve learned that, and next time I get involved, I’ll be wiser.
This year
I would like to get back to developing web apps again and be more proactive in maintaining and developing applications and software that I already have instead of looking for the next fun project to work on. I’ll probably be writing new stuff, but it will be based solely on need, not just because it is something that I want.
There are definitely a lot of things to learn from the past year (or two) and hopefully moving forward I won’t be making the same mistakes twice.
Posted in asides


One of the cooler projects out there in the open source world, I believe, is Clang, a ground-up implementation of a C/C++ front-end for LLVM. Thanks to a status message by Joel Falcou I looked at the cfe-dev mailing list archive and found this:
I decided to see how well clang++ currently does at compiling boost.headers on my linux box. So I took all the files from /usr/include/boost and compiled them. In my test it turns out that clang successfully compiled about 80% of the boost headers.
80% of Boost? That’s not bad at all.
I think Clang as an alternative to GCC would be very welcome indeed, not only because I think GCC is inferior but also because I think it’s time that a community-driven C++ compiler implementation under a more liberal license (the LLVM Release License) was created. What Clang promises is (according to the website):
The goal of the Clang project is to create a new C, C++, Objective C and Objective C++ front-end for the LLVM compiler.
What they have done so far is nothing short of amazing, because to be able to compile part of Boost, your front-end (preprocessor, lexer) ought to be robust enough and your backend (AST generator, code generator) able to handle the stress of template metaprogramming.
Doug Gregor is one of the members of the Clang development team and he responds with the added thought:
Very cool. Just for kicks, I ran the testsuite for Boost.MPL, and we're passing all but 4 tests there. Not bad! To tackle Boost, it's worth starting with the regression tests for the lowest-level libraries and working upward. We're not going to get it all in one shot, and it's best to grow the set of libraries that works over time.
It’s a noble goal to be able to support one of the most challenging libraries as far as pushing C++ compilers is concerned (Boost.MPL is the Metaprogramming Library which pushes all the compilers to the limits for supporting template metaprogramming with C++). Getting all but 4 of the tests to pass for a community-driven, ground-up compiler is nothing short of a great achievement!
I’ll be watching Clang closely and hopefully I’ll be singing its praises for being better than GCC as far as compile time and optimization are concerned. I long to see the day when GCC’s dominance as the open source compiler of choice is challenged in a serious way, and I think Clang is getting there.
Have you tried Clang yet? What do you think of LLVM?
Posted in community, cool stuff, thoughts


Just after writing a piece about how C++ can be faster than C, I run into this article that speaks about how C is better than C++. Adam Smith writes:
It has been claimed that C++ is a better C them [sic] C. this is being taken to mean that when switching to C++ you can continue to code more or less as she did in C and use a little extra C++ functionality for convenience. The problem with that is that a lot of things which are perfectly safe to do and see are not safe to do while using C++. So here is my list of issues not found in C. You can avoid many of these issues in C++ by limiting what features you use. But you never have any guarantees. You can’t pick up random C++ code, look at it and be certain whether it is doing something safe or not when e.g. statically initializing a variable.
He posts a list of things on his entry which I personally would like to address in this post.
The list is pretty long if you look at his post, and it seems convincing at the outset. He highlights some potentially problematic issues that are already known to C++ programmers. Before I go into the specific points he raises, I’d like to point out a few high-level observations.
First, the post reads more like FUD than anything else. Granted, C has its place in the programming ecosystem — mainly because of legacy and a largely uninformed pool of programmers about the benefits of C++. The personalities surrounding the support for C are quite big and have their own points they raise. This is the first time though that I’ve seen a list of purposely one-sided “propaganda” against C++ in favor of using C.
Next, he does not mention anything about the scalability and maintainability of well-written code. He obviously concentrates on the features of C that seem better than C++ on first read, but he fails to account for the richness of the C++ language at any point in his article. Of course, it’s all fair and I’m just going to let my side be read.
One more thing: even after a little digging into the issues he raises, you can quickly find arguments that put his claims in context, even where they do not refute them outright.
Having said this, let me dive into some of my favorites:
Static initialize is safe in C but not in C++, because in C++ static initialization can cause code to run, which depends on other variables having been statically initialized. It can also cause cleanup code to run at shutdown which you can’t control sequence of (destructors).
This is true, but only for non-POD types. Remember that in C++ you can still use PODs, which have no non-trivial constructors, and initialize them safely. If you were writing in C and wanted to statically initialize a struct that referred to other structs via pointers, you would run into the same problem unless all the other structs already had an address at the point where you wanted to link to them. If you are dealing with PODs in C++, static initialization is no different. So if your case is that C is safer for static initialization, you are at the same level with C++: C only has PODs, and PODs in C++ are safely initialized statically too.
If you’re trying to compare static initialization of non-POD C++ types with C structs, then it’s not a fair comparison anyway right?
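To make the POD case concrete, here is a minimal sketch. The node struct and the names head, tail and sum_chain are made up for illustration, and I am assuming a C++0x compiler so that constexpr and static_assert can prove that no code runs at startup.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical POD-style aggregate: no constructors, no destructor.
struct node {
    int value;
    const node* next;
};

static_assert(std::is_trivial<node>::value && std::is_standard_layout<node>::value,
              "node follows the same initialization rules as a C struct");

// constexpr forces compile-time (static) initialization, so the link
// between the two objects is resolved before any code runs.
constexpr node tail = {2, nullptr};
constexpr node head = {1, &tail};

static_assert(head.next->value == 2, "the link is already resolved at compile time");

// Walks the statically initialized chain and adds up the values.
int sum_chain(const node* n) {
    int total = 0;
    for (; n != nullptr; n = n->next) total += n->value;
    return total;
}

// Small usage check.
void check_static_chain() {
    assert(sum_chain(&head) == 3);
}
```

Once node gets a user-written constructor or destructor it stops being a POD, and that is the point where the ordering concerns from the quote start to apply.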
C gives you better control over what happens when your code is executed. When reading seek [sic] out it is fairly straightforward to decipher one code is getting executed and when memory is just restart or primitive operations are performed. In C++ on the other hand your have to deal with several potential problems:
- A simple variable definition can cause code to run (constructors and instructors)
- Implicitly generated and called functions. If you didn’t define constructors, destructors and operator= you will get them generated for you.
Control is a matter of trust. If you were using a C library that wasn’t written by you and you absolutely didn’t do anything funky in a function implementation, then you can only control the code you write. This goes the same way with C++ — or any language at that — which means if you write everything yourself then you get as much control over what your program does like in any other programming language. The “illusion” of control is essentially what’s referred to here.
The language features available in C++ — constructors, destructors, operator overloading, etc. — are there for a reason, and they are good in their own right. What they do is allow you to control what happens at different stages of the object’s life cycle. In C, you don’t even get to control these things and you resort to hacks and workarounds just to get what you need in situations where they are applicable.
C supports variable sized arrays on the stack. Which is much faster to allocate than on the heap. (C99 feature)
This is a performance “pot-shot” which is actually easy to refute, because in C++ you can also make variable-sized arrays that live in many places: the stack, the heap, shared memory, or a file, among others. If you’re after fast allocation, then you can use a memory pool that lives on the stack and allocate objects from there. This is perfectly fine in C++. The question becomes: in C, can you make a variable-sized array that lives on the heap that is “much faster to allocate” than one in C++?
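As a rough sketch of the "memory pool that lives on the stack" idea, something like the following would do. The stack_arena template and sum_first are hypothetical names of my own, not from any library, and the arena only hands out trivially destructible types to keep things short.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

// Hypothetical bump allocator whose storage is a plain array member, so an
// arena declared as a local variable lives entirely in the stack frame.
template <std::size_t Capacity>
class stack_arena {
public:
    stack_arena() : used_(0) {}

    // Returns `count` value-initialized objects, or nullptr if they do not fit.
    template <class T>
    T* allocate(std::size_t count) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "this sketch never runs destructors");
        static_assert(alignof(T) <= alignof(std::max_align_t),
                      "over-aligned types are not supported");
        const std::size_t align = alignof(T);
        const std::size_t start = (used_ + align - 1) / align * align;
        if (start > Capacity || count > (Capacity - start) / sizeof(T)) return nullptr;
        T* first = reinterpret_cast<T*>(buffer_ + start);
        for (std::size_t i = 0; i < count; ++i) ::new (static_cast<void*>(first + i)) T();
        used_ = start + count * sizeof(T);
        return first;
    }

private:
    alignas(std::max_align_t) unsigned char buffer_[Capacity];
    std::size_t used_;
};

// Sums 1..n using a runtime-sized int array carved out of a stack arena.
// Returns -1 when n ints do not fit into the arena.
int sum_first(std::size_t n) {
    stack_arena<1024> arena;               // storage is part of this stack frame
    int* values = arena.allocate<int>(n);  // runtime-sized, no heap allocation
    if (values == nullptr) return -1;
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<int>(i + 1);
        total += values[i];
    }
    return total;
}

// Small usage check.
void check_stack_arena() {
    assert(sum_first(10) == 55);
}
```

Allocation here is a handful of arithmetic operations, which is the same reason C99 variable-length arrays are cheap; the difference is that the arena reports failure instead of silently overflowing the stack.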
No name mangling. If you intend to read generated assembly code, this makes that much easier. It can be useful when trying to optimize code.
This is a classic. Actually this is a limitation of the compiler and the assembler, not really the programming language. The C++ programming language as far as I know doesn’t even specify the name mangling implementation! I’d be willing to bet that if a compiler were to actually not mangle C++ symbols, be C-friendly, and used an assembler that could contain the non-mangled names that this wouldn’t even come up in the list.
De facto standard application binary interface (ABI). Code produced by different compilers can easily be combined.
This is another one that is not even a feature of the programming language. This again is a consensus that the compiler writers have typically agreed upon to implement in a certain way — or, because of the way functions are implemented typically in assembler, allows for “easier” cross-compatibility. Again this is not even specified in the C standard (the ABI) and even across different compiler versions (not even different vendors), ABI compatibility is not guaranteed.
Much easier to interface with other languages. A lot of languages will let you call C functions directly. Binding to a C++ library is usually a much more elaborate job.
This is a statement of currency, not a statement of truth. This is describing what’s happening today but not really what the truth is. The truth behind this is that other programming languages have implemented interfaces to support explicitly C-style function extension. That is the only reason why C would have a leg-up on C++ and it’s not even because of C.
Compiling C programs is faster than compiling C++ programs, because parsing C is much easier than parsing C++.
Again, this is not the language but the tools used to compile the language. Even if the C language has fewer keywords and fewer constructs to parse, the speed of compilation has more to do with the compiler than with the programming language. Just because you can write complex, template-heavy C++ programs, and just because your optimizer runs overtime to generate great executable code, doesn’t mean the slowness is the fault of C++ (the language).
Varargs cannot safely be used in C++. They’re not entirely safe in in C either. However they’re much more so in the C++, to the point that they are prohibited in the C++ coding standards (Sutter, Alexandrescu).
The limitation for Varargs has something to do with vendor-specific details on the size of objects that are placed on the varargs list. This has nothing to do with C++ per se.
In the first place, variable argument lists are inherently unsafe because you store a raw pointer to *something* but you lose the type information of that *something*. The number one advantage that C++ has over C is the ability to write type-safe code, and anytime you’re using an inherently type-unsafe language feature like variable argument lists then all bets are off, you’re on your own. This again is not a limitation of the language, but rather the nature of the beast of type-unsafe programming.
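To illustrate the difference, here is a small sketch that puts a C-style variable argument list next to a type-checked alternative. The function names are hypothetical, and the second version assumes C++0x variadic templates are available.

```cpp
#include <cassert>
#include <cstdarg>
#include <type_traits>

// C-style: the callee has to trust `count` and assume every argument is an
// int. Passing a double or the wrong count compiles fine and misbehaves.
int sum_varargs(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) total += va_arg(args, int);
    va_end(args);
    return total;
}

// Type-safe: the compiler knows the type and number of every argument.
inline int sum_typed() { return 0; }

template <class T, class... Rest>
int sum_typed(T first, Rest... rest) {
    static_assert(std::is_same<T, int>::value, "sum_typed only accepts int arguments");
    return first + sum_typed(rest...);
}

// Small usage check.
void check_sums() {
    assert(sum_varargs(3, 1, 2, 3) == 6);
    assert(sum_typed(1, 2, 3) == 6);
}
```

A call such as sum_typed(1, 2.5) is rejected at compile time, whereas sum_varargs(2, 1, 2.5) compiles and then reads the wrong type from the argument list.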
C requires less runtime support. Makes it more suitable for low-level environments such as embedded systems or OS components.
The runtime requirements of C and C++ are just the same unless you rely on exceptions or runtime type information (RTTI) in your C++ code. This means that if you’re throwing exceptions, calling dynamic_cast<>, or using typeid, then you do need the extra runtime support. But what you get with that support in C++ you do not get in C at all, unless you implement it yourself with your structs.
Standard way in C to do encapsulation is to forward declare a struct and only allow access to its data through functions. This method also creates compile time encapsulation. Compile time encapsulation allows us to change the data structures members without recompilation of client code (other code using our interface). The standard way of doing encapsulation C++ on the other hand (using classes) requires recompilation of client code when adding or removing private member variables.
I’ll just say one thing to refute this: the pimpl idiom anyone?
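For anyone who has not run into it, this is roughly what the pimpl idiom looks like. It is a sketch with made-up names (connection, impl, describe), squeezed into one file, and it assumes std::unique_ptr is available; boost::scoped_ptr would play the same role.

```cpp
#include <cassert>
#include <memory>
#include <string>

// --- what would live in the public header ---
class connection {
public:
    explicit connection(const std::string& host);
    ~connection();                    // defined where impl is a complete type
    std::string describe() const;

private:
    struct impl;                      // forward declaration only
    std::unique_ptr<impl> pimpl_;     // clients only ever see one pointer
};

// --- what would live in the .cpp file ---
struct connection::impl {
    std::string host;
    int retries;                      // members can change without touching the header
};

connection::connection(const std::string& host) : pimpl_(new impl{host, 3}) {}

connection::~connection() = default;

std::string connection::describe() const {
    return pimpl_->host + " (retries: " + std::to_string(pimpl_->retries) + ")";
}

// Small usage check.
void check_connection() {
    connection c("example.org");
    assert(c.describe() == "example.org (retries: 3)");
}
```

Adding or removing members of connection::impl only recompiles the implementation file, which is the same compile-time encapsulation that the forward-declared struct gives you in C.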
Oh, and there’s more:
Disliking C++ is not a fringe thing. It does not mean that one is not capable of understanding complex languages. Quite a lot of respect computer science people and language designers aren’t fond of C++.
Because a lot of people didn’t agree that the earth is round, then that should mean the earth isn’t round.
As I said earlier, his post is so full of FUD that it’s not hard to understand why a lot of people are misinformed about C++.
Hopefully I’ve shed some light on the issues raised above, and hopefully that is helpful to you!
Do you have any more things to add to the discussion? I’d definitely love to read what you think!
Posted in insights, rant, review, thoughts


Alex Ott wrote a fascinating blog post about Boost.Spirit v2.x’s integer parser performance compared to standard C’s ‘atoi’, in one example of how C++ is faster than C in some cases. He writes:
I did very primitive test of performance for new version of boost.spirit. I read somewhere, that boost.spirit v.2 over-performs atoi on parsing of integers, and I decided to check this with help of program, shown below. With compilation in release mode, and for 10000000 iterations, I got following results…
And when you read on his post, you may be surprised about his findings. Shocked even.
I am already a fan (a huge fan even) of Boost.Spirit but this is one of those things that make me believe even more and more in the power of Boost.Spirit and Template Metaprogramming. So why does Boost.Spirit beat standard C’s ‘atoi’ implementation?
I tried to look into the assembler output, but I didn’t see any glaring signs of why the implementation would be inherently faster. One thing that I did notice in the assembler output for the test program that I wrote was that there was almost a 1:1 correspondence between the C++ code and the assembler output, even if (or maybe because) it was hidden under heavy abstraction in terms of C++ template metaprogramming.
Another thing that I noticed was how the implementation was using only a handful of registers too to implement the tight loops — something I wasn’t able to verify from the C implementation of ‘atoi’.
Does this make a case for a C++ implementation to make more efficient standard C functions? If tests consistently show that a C++ implementation out-performs the C equivalent, should a case be made to flaunt that fact and maybe improve the standard C library implementations using C++? This doesn’t even benefit from the upcoming C++0x features like rvalue references, move semantics, and the consistent memory model.
What do you think?
Posted in boost, c++0x, cool stuff, experiment


As I do keep tabs on what’s going on with C++ around the web, I chanced upon this short and succinct article about ‘free’ and ‘delete’ not returning memory to the OS from Thought Garage which starts with:
When you call free() or delete(), it will NOT really release any memory back to OS. Instead, that memory is kept with the same process until it is terminated. However, this memory can be reused for any future allocations by the same process.
It’s interesting to read this and remember some horror stories I saw with memory usage on long-running processes (HTTP servers to be exact). Is this true then?
I decided to check first by reading the documentation for libc’s free:
Occasionally, free can actually return memory to the operating system and make the process smaller. Usually, all it can do is allow a later call to malloc to reuse the space. In the meantime, the space remains in your program as part of a free-list used internally by malloc.
That’s not very clear although it mentions that occasionally memory will be returned to the OS. I dug a little deeper by looking at other malloc/free family of functions and saw ‘mallopt‘, and that you can actually change the way malloc/free behave (especially if you’re using GNU libc — pointers about other platform implementations would be helpful). One option that is interesting to look at is M_TRIM_THRESHOLD which:
This is the minimum size (in bytes) of the top-most, releasable chunk that will cause sbrk to be called with a negative argument in order to return memory to the system.
So if you really want to control when you return memory to the operating system, then tuning malloc with mallopt would be a good option.
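Here is a minimal sketch of what that tuning could look like with GNU libc. The wrapper names set_trim_threshold and release_free_heap are my own; malloc_trim is an extra glibc call that I have not discussed above, and on other platforms the wrappers simply do nothing.

```cpp
#include <cassert>
#include <cstdlib>
#if defined(__GLIBC__)
#include <malloc.h>   // mallopt, M_TRIM_THRESHOLD, malloc_trim (glibc extensions)
#endif

// Asks malloc to give the top of the heap back to the system once at least
// `bytes` of it are free. Returns true only if the setting was applied.
bool set_trim_threshold(int bytes) {
#if defined(__GLIBC__)
    return mallopt(M_TRIM_THRESHOLD, bytes) == 1;
#else
    (void)bytes;
    return false;      // no equivalent assumed on this platform
#endif
}

// Explicitly asks glibc to release free heap memory right now.
// Returns true if some memory actually went back to the system.
bool release_free_heap() {
#if defined(__GLIBC__)
    return malloc_trim(0) == 1;
#else
    return false;
#endif
}

// Small usage check: allocate, free, then ask for the memory to be returned.
void check_trim() {
    set_trim_threshold(64 * 1024);
    void* block = std::malloc(1024 * 1024);
    assert(block != nullptr);
    std::free(block);
    release_free_heap();   // may or may not release anything
}
```

Whether any memory really goes back depends on the allocator's state, so I would treat the return value of release_free_heap as informational only.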
Another thing that comes to mind: when you are doing C++ development and know you will be making a lot of allocations of small objects from the heap, object pools make a lot of sense. One popular implementation of an object pool is Boost.Pool. In my experience there are some issues with Boost.Pool’s standard allocator interface, but its object_pool and pool interfaces yield desirable results in situations where performance matters.
One way of getting around the Boost.Pool issues with the standard containers is by using Boost.Interprocess containers to use a stateful allocator instance that deals with a user-controlled Boost.Pool object_pool.
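To show the basic idea behind an object pool without pulling in Boost, here is a deliberately simple sketch. The simple_pool class is hypothetical and is not Boost.Pool's interface: it makes one up-front allocation and then recycles the same objects through a free list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity pool: all objects are created once, and
// acquire/release only move pointers on and off a free list.
template <class T>
class simple_pool {
public:
    explicit simple_pool(std::size_t capacity) : slots_(capacity) {
        free_.reserve(capacity);
        for (std::size_t i = 0; i < capacity; ++i) free_.push_back(&slots_[i]);
    }

    simple_pool(const simple_pool&) = delete;             // pointers into slots_
    simple_pool& operator=(const simple_pool&) = delete;  // must stay valid

    // Hands out a pooled object, or nullptr when the pool is exhausted.
    T* acquire() {
        if (free_.empty()) return nullptr;
        T* object = free_.back();
        free_.pop_back();
        return object;
    }

    // Returns an object previously obtained from acquire().
    void release(T* object) { free_.push_back(object); }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<T> slots_;   // never resized, so addresses stay stable
    std::vector<T*> free_;
};

// Small usage check.
void check_pool() {
    simple_pool<int> pool(2);
    int* first = pool.acquire();
    assert(first != nullptr);
    pool.release(first);
    assert(pool.available() == 2);
}
```

After construction neither acquire nor release touches malloc or free, which is why pools help in long-running processes that churn through many small objects.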
Do you have any tips with better memory management when developing C++ applications?
Posted in boost, cool stuff, insights, tips


After a considerably long break during the holiday season of 2009, I got back into the development mood in the beginning of 2010. After implementing HTTPS support in the development version of cpp-netlib towards the end of the year, I then moved on to one of the seemingly harder parts of implementing HTTP 1.1 support in cpp-netlib. That said, I tried to delay this as much as I could because I knew it would take considerable effort and time to get it working. But then something happened that allowed me to finish the implementation to a level where I would want to release it into the wild for testing. What was that something, you ask?
I realized that I was in a position unlike any I had been in before: I was doing open source development full-time, as part of a project that I am working on as a consultant. It was the freedom to work on what I really loved that allowed me to bring my A-game to the fore and just get done what needed to be done. It was a labor of love not because I was being paid for it, but because I would have paid to do what I was doing. This realization from the start of 2010 is what I’m taking along for the rest of the year.
After sharing the personal side of the story, I’ll then share the technical specifics of what I’ve done to finish the implementation. So what exactly did I do to implement HTTP 1.1 chunked encoding support in cpp-netlib? Let me highlight some of the details below.
Carve out the area for extension.
To those implementing libraries in an iterative fashion, maybe this technique does not need a name. But I am attempting to put a name on this technique which I identified while going through the various stages of iteration on the cpp-netlib implementation.
So how do you carve out the area for extension?
Sometimes all it involves is a simple “if” statement. Say you are writing a function and you know there may be variations on the implementation. You would usually mark it with a comment that says “FIXME: how about other cases?” or “TODO: implement other cases here.”, or just try to remember it. What I did, on the other hand, is something like this:
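#include <stdexcept>

void foo(bar & param) {
    if (param.condition()) {
        // implement the affirmative
    } else {
        // carved-out area: unsupported inputs fail loudly until this is implemented
        throw std::runtime_error("I'm not implemented yet.");
    }
}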
Yes, it’s not rocket science, but you will be surprised to see that this works very well by carving out an area where an extension to the functionality can be implemented. Your tests will fail because for those inputs that are yet unsupported, there is an exceptional condition that you can program against. Later on in the development process, you know exactly where the implementation for the extension will be put and what the context is in which you will be operating upon. Even later on you can decide to, once you’ve implemented the extension, refactor so that you don’t see the remnants of the if statement anymore.
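Applied to something like transfer encodings, the same idea could be sketched as follows. The decode_body function and its parameters are made up for illustration and are not the actual cpp-netlib code.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical example: only the "identity" transfer encoding is supported
// so far; every other encoding lands in the carved-out branch.
std::string decode_body(const std::string& transfer_encoding, const std::string& raw) {
    if (transfer_encoding == "identity") {
        return raw;  // the case that is implemented today
    } else {
        // carved-out area: this is where e.g. chunked decoding will go later
        throw std::runtime_error("transfer encoding not implemented yet: " + transfer_encoding);
    }
}

// Small usage check: a test can program against the exceptional condition.
void check_carve_out() {
    assert(decode_body("identity", "abc") == "abc");
    bool threw = false;
    try {
        decode_body("chunked", "3\r\nabc\r\n0\r\n\r\n");
    } catch (const std::runtime_error&) {
        threw = true;
    }
    assert(threw);
}
```

When the else branch finally gets a real implementation, the second half of the check starts failing, which is a handy reminder to replace it with a proper test for the new behavior.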
Keep It Simple — Hide Complexity
I believe firmly that the interface influences the implementation. The cpp-netlib interface to an HTTP client is as simple as I can imagine it, although everybody I know who has talked to me about what I’m doing tells me that the complexity of HTTP will eventually corrupt this interface. One of the things that I’ve striven for while implementing the beginnings of cpp-netlib is to hide the complexity of the implementation details from the users of the library. To that end, one of the techniques I use is deferring the implementation to base types.
You might think I wrote wrongly there and meant deferring implementation to derived types — and I use that idiom too — but to keep the interface to a class simple and pristine, you must be willing to move implementation details to the base types. By leveraging mix-in types that serve a specific (or generic) purpose, you must break the OOP mindset of putting the details in the derived types.
To illustrate, I will show an example of where hiding the complexity in the base class would ultimately lead to a better interface.
Let’s say you have a basic_client template and want to be able to hide the complexity of the interface by keeping it simple up front. You can then do something like this:
template <class Tag>
struct basic_client {
    // ...
    void foo() {
        // complex implementation
    }
    void bar() {
        // complex implementation
    }
};
Now let’s say you want to hide the complexity and change the behavior of the basic_client implementation based on the ‘Tag’ parameter. You can then do things like this:
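// Tag types used to select an implementation (names are only illustrative).
struct default_tag {};
struct special_tag {};

template <class Tag>
struct client_base {
protected:
    // normal implementation
    void foo() { /* ... */ }
    void bar() { /* ... */ }
};

template <>
struct client_base<special_tag> {
protected:
    // special kung fu
    void foo() { /* ... */ }
    void bar() { /* ... */ }
};

template <class Tag>
struct basic_client : client_base<Tag> {
    void foo() {
        client_base<Tag>::foo();
    }
    void bar() {
        client_base<Tag>::bar();
    }
};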
This then allows you to specialize the base implementation and keep the complexity almost literally behind the scenes. This also allows you to extend the possibilities and configurations based on the ‘Tag’ parameter. You can then extend this in many different ways by just using many different tags.
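Here is a self-contained sketch of the technique that you can compile and poke at. The tag names and the describe member are hypothetical, chosen only so that the selected implementation becomes visible.

```cpp
#include <cassert>
#include <string>

// Hypothetical tags that pick an implementation at compile time.
struct default_tag {};
struct special_tag {};

// Primary template: the normal implementation lives in the base.
template <class Tag>
struct client_base {
protected:
    std::string describe() const { return "normal implementation"; }
};

// Full specialization: a different implementation for one particular tag.
template <>
struct client_base<special_tag> {
protected:
    std::string describe() const { return "special implementation"; }
};

// The public interface stays the same regardless of the tag.
template <class Tag>
struct basic_client : client_base<Tag> {
    std::string describe() const { return client_base<Tag>::describe(); }
};

// Small usage check.
void check_tags() {
    basic_client<default_tag> plain;
    basic_client<special_tag> special;
    assert(plain.describe() == "normal implementation");
    assert(special.describe() == "special implementation");
}
```

Users only ever spell basic_client<some_tag>; which base gets mixed in is decided entirely by the tag, so adding a new variation means adding a tag and a specialization rather than touching the interface.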
Build to Change
The last technique I use is to assume that everything in the implementation at one point in time will change. By forcing myself to think that everything can change, I am more cautious about the things I hard-code or the design decisions that I make. The one rule I stick to when implementing cpp-netlib is that “Keep the interface simple and the implementation flexible.” — so at any given release of the cpp-netlib code, the implementation details can change drastically, but the interface will stay simple — maybe the performance profile gets better, the extension possibilities are higher, and more features are supported and added, but the interface should remain simple.
That doesn’t mean that the interface will not change. There will be backward-incompatible changes that will happen, and since this is the early stage of the development (I say that but cpp-netlib development has been going on and off for three years already or whatnot) there will be changes that may break from earlier tradition. However the promise of a simple interface and a powerful header-only library implementation is held in high regard.
One thing that is constant is change, and with cpp-netlib it permeates the implementation. This means any implementation detail is marked for further improvement later on and is subject to re-implementation.
Parting Shots
And no, this is not another technique. But what I do want to share is that as I go on distilling these techniques in my head and eventually sharing them with the world, I hope they are helpful to you. This is after all an ongoing effort and discussion that is definitely interesting to me.
Do you have any techniques you’d like to share with regards to flexible C++ library development?



One of the more basic and fundamental idioms of C++ programming is that of RAII. For something so simple and fundamental, it is surprising that C++ seems to be one of the few programming languages that support this idiom directly. From Wikipedia, RAII is:
Resource Acquisition Is Initialization, often referred to by the acronym RAII (or, erroneously, RIIA), is a popular design pattern in several object oriented programming languages like C++, D and Ada. The technique was invented by Bjarne Stroustrup,[1] to deal with resource deallocation in C++. In this language, the only code that can be guaranteed to be executed after an exception is thrown are the destructors of objects residing on the stack. Resources therefore need to be tied to the lifespan of suitable objects. They are acquired during initialization, when there is no chance of them being used before they are available, and released with the destruction of the same objects, which is guaranteed to take place even in case of errors.
However some people (C++ programmers at that) seem to still be confused by this idiom. One thing I read while keeping tabs on the web for C++ articles is this one from jalf.
He writes:
But let’s get the name out of the way first. RAII stands for “Resource Acquisition Is Initialization”. And if you’re not already familiar with the idiom, then this has told you nothing at all. If you did know about RAII in advance, then you can, when you stop and think about it, kind of see how the name relates to it… vaguely… sort of.
What it actually means is simple: Resources should be managed by classes. When the class is initialized, the resource is acquired (hence the name). When the class is destroyed, the resource is released. And the lifetime of the object should exactly match the desired lifetime of the resource. That sounds obvious, and many programmers will (assuming they’re working in a language that has classes), say that this is what they always do.
Fortunately, RAII is so simple sometimes all you need is common sense and a good understanding of what in C++ is called “scope”. Let’s look at what “scope” means to better understand how RAII really works.
First, an example: a class that contains an integer.
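void bar(int & a);  // defined elsewhere; assumed not to throw

struct foo {
    static int a;
    foo() { ++(foo::a); }     // count each new instance
    ~foo() { bar(foo::a); }   // hand the counter to bar on destruction
};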
There’s nothing spectacular about the code above. All it does is this: when the type “foo” is instantiated, the static class member int “a” is incremented. When an instance of foo is destroyed, the class member int “a” is passed to the function “bar” as an lvalue argument (let’s talk about lvalues and rvalues another time). At this time, it’s not important what “bar” does, but let’s assume it doesn’t throw (because throwing in destructors is bad practice).
Now let’s look at how different instances of “foo” at different scopes behave:
#include <iostream>
using namespace std;

// Prints the current count, then decrements it.
void bar(int & a) {
    cout << "Bar: " << a-- << endl;
}

struct foo {
    static int a;
    foo() { ++(foo::a); }
    ~foo() { bar(foo::a); }
};

int foo::a = 0;
foo global;  // constructed before main runs, destroyed after it returns

int main(int argc, char * argv[]) {
    foo local;
    {
        foo local_in_anonymous_scope;
        cout << "Exiting anonymous scope." << endl;
    }
    cout << "Exiting function scope." << endl;
    return 0;
}
What we should see is the following output:
Exiting anonymous scope.
Bar: 3
Exiting function scope.
Bar: 2
Bar: 1
Here we see that “bar” is called three times: first when local_in_anonymous_scope is destroyed because it goes out of that local anonymous scope, second when local is destroyed as the main function scope exits, and third when global is destroyed as the global scope is cleaned up. Notice that we didn’t need to use either “new” or “delete” in this situation because we’re fine with using instances on the stack. There may be times where you would want to put data on the heap, but I won’t go into that now.
For what it’s worth, the simplicity of this approach lends RAII very well to wrapping resources in an object: once the object is in a “ready” state, you can use the object (and the resource) just fine until the object goes out of scope or gets destroyed. One good example of the use of RAII is Boost.Thread’s scoped_lock, which you can only instantiate with a reference to an existing mutex object and which you can place on the stack; once the stack is cleaned up, the lock on the mutex is released without your having to release it explicitly.
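A sketch of the same pattern using the C++0x standard library instead of Boost.Thread could look like this. Note the swap: std::lock_guard stands in for scoped_lock here, and tracked_mutex is a hypothetical wrapper I added only so that the release becomes observable.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical wrapper that records whether the mutex is currently held.
struct tracked_mutex {
    std::mutex inner;
    bool held = false;
    void lock()   { inner.lock(); held = true; }
    void unlock() { held = false; inner.unlock(); }
};

tracked_mutex counter_mutex;
int counter = 0;

// Increments the shared counter under the lock and optionally fails.
void increment(bool fail) {
    std::lock_guard<tracked_mutex> guard(counter_mutex);  // lock acquired here
    ++counter;
    if (fail) throw std::runtime_error("simulated failure");
}   // guard's destructor releases the lock, on return and on throw alike

// Small usage check: the lock is released even when increment throws.
void check_guard() {
    try {
        increment(true);
    } catch (const std::runtime_error&) {
        // nothing to clean up by hand
    }
    assert(!counter_mutex.held);
}
```

There is no unlock call anywhere in increment, and there is no way to forget one on the error path, which is the whole appeal of the idiom.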
Do you know of any interesting uses of RAII that you’d like to share?
Posted in community, insights, tips


This is my first post from wordpress.com and if you are reading from the RSS feed and you have time to check it out, you can go to http://cplusplus-soup.com/ — I hope you like the new look and home!
While keeping tabs on the web and what’s going on with C++, I ran into this article by David Narvaez that said something about using Python in C++:
… Yet, a topic that was left aside was embedding Python in C/C++ code, which we all knew was possible but we didn’t know how. Since then, and maybe because of my happy experiences embedding Lua and QtScript in C++, I promised I’d get into that, and, using Python documentation, some blog posts and a lot of trial and error, got it working a couple of months ago. …
Once in a while I run into a great article like this complete with code samples, but I’m not particularly impressed with the solution in general. For one thing, I would have wanted a more C++ oriented solution than one that uses the bare Python C API directly. Was there a better way?
I then looked around and found that this is already possible in a clean manner using Boost.Python. The relevant information can be found in the Embedding section of the documentation which I’ve quoted here:
Boost.python provides three related functions to run Python code from C++.
    object eval(str expression, object globals = object(), object locals = object())
    object exec(str code, object globals = object(), object locals = object())
    object exec_file(str filename, object globals = object(), object locals = object())
eval evaluates the given expression and returns the resulting value. exec executes the given code (typically a set of statements) returning the result, and exec_file executes the code contained in the given file.
The globals and locals parameters are Python dictionaries containing the globals and locals of the context in which to run the code. For most intents and purposes you can use the namespace dictionary of the __main__ module for both parameters.
What’s interesting is that because a Python object is conveniently wrapped in a C++ object, we can deal with the results in a manner that is more in-line with C++ semantics. This means we can write code like this (still quoted from the documentation):
    object main_module = import("__main__");
    object main_namespace = main_module.attr("__dict__");
    object ignored = exec("result = 5 ** 2", main_namespace);
    int five_squared = extract<int>(main_namespace["result"]);
Here we create a dictionary object for the __main__ module’s namespace. Then we assign 5 squared to the result variable and read this variable from the dictionary. Another way to achieve the same result is to use eval instead, which returns the result directly:
    object result = eval("5 ** 2");
    int five_squared = extract<int>(result);
Knowing this I just might be able to make use of it in my current and upcoming projects where I would need an extension mechanism that is as powerful as the Python programming language.
Do you know of any other cool ways of embedding other programming languages within C++ programs?
Posted in community, cool stuff, library, tips


I chanced upon this article that gives a take on the coding guidelines that popular projects post for code portability. In particular, jasper22 says:
This document is quite a briefed [sic] guide to check whether your coding practices in C++ are standard or not and to provide possible fixes for the same. This document was written after being fed up by the non-portable codes posted on the Usenet/Forums by many (many many) beginners.
The document is short and to the point with code examples. However, he also goes on to say this:
I came across a guide, “C++ Portability Guide” from Mozilla Developers’ Center[MCD's C++ Portability Guide]. A point I noted (which upsets me), while I was reading it was that, the guide focuses on the popular coding practices in C++. It often gives you advices which will surely make your code portable but on the cost of degrading your language quality.
I personally do not contribute to Mozilla C++ projects, and if I did, I would have some trouble following the guide, especially since I use and almost rely on the Standard Template Library. I also definitely rely on C++ templates for most of my libraries, so I may not be able to contribute to any Mozilla project because of this section:
Don’t use C++ templates unless you do only things already known to be portable because they are already used in Mozilla (such as patterns used by nsCOMPtr or CallQueryInterface) or are willing to test your code carefully on all of the compilers we support and be willing to back it out if it breaks.
I am not sure how that works out — I haven’t checked the Mozilla code yet to know what these patterns are.
Another notorious C++ style guide is Google’s. The Google C++ Style Guide is chock full of details (it comes as an XML file, and I will not be surprised if it’s actually enforced internally by a lint tool in the form of scripts and/or compiler enhancements). One of the sections that I definitely have trouble with is this:
Boost
Use only approved libraries from the Boost library collection.
As an avid Boost fan and advocate, I am a little put off by this. However, I am not surprised that for a company as big as Google with enough experts per square inch to merit reviewing the interesting Boost libraries that they intend to use internally, that this is a policy. Still, it’s going to be hard enough to get your library into Boost — being able to use it in a Google Open Source Project may be a problem in an entirely different league.
So my take on this is this:
In a nutshell, coding guidelines are just there for organizations of any considerable size to abide by to help collaboration and foster an organized approach to quality control. There is nothing wrong with Style Guides and as long as you know why the items in the style guide are there and that it’s something that’s actually helping the development and maintenance of software, then that’s even better. Style Guides that prohibit certain practices are there for a reason and even if the reason is not always valid or acceptable, it’s a matter of policy and is therefore a question of whether you agree or disagree.
As an open source developer and erstwhile contributor to the Boost C++ Library, I abide by the Boost C++ Style Guide, which is liberal and very forgiving, and still standards-compliant.
Although I think the C++ Standard is enough of a coding guide, I don’t think that means other style guides should not exist.
How about you, what do you think?
Posted in community, thoughts


While catching up on my C++ reading, I chanced upon this article from Intel Software Network by Dr. Dick Brown of St. Olaf College in Northfield, Minnesota which is basically pointing out the writing on the wall which I’ve already pointed out to colleagues while I was still in college. From the article:
This calls for teaching parallelism at all levels of the CS curriculum, starting with CS1 (introductory CS), because the need for learning to reason in parallel can no longer be just a subject area, like AI or HCI, but must become a universal skill, like programing or problem solving. Parallel thinking must become an integral part of CS education, like knowing how and when to write a function definition or having the skills to wield data abstraction and encapsulation effectively. Soon after our students are comfortable with “for” loops, we must give them a chance to write parallel “for” loops, together with some criteria for knowing when parallel “for” is appropriate. Among the various properties we point out with a new data structure, we must consider scalability: would you use a given strategy with gigabytes of data? Petabytes? What measures might one need to insure that an algorithm is thread-safe in its access of data? We used to safely hide these parallel computing concerns away in the hardware or in system code, and relegate the teaching of those issues to an upper level course or two, but now everyone needs to become educated in parallelism.
The above quote (with bold emphasis mine) shows that everyone who’s still in college or getting into college to study computer science should start learning about parallel computing on their own (or at least lean on the instructors who do have experience with it), because things in college are not going to change as fast as the technology is changing. I personally have seen the trend toward parallel computing at various scales as the next wave that computing will be riding since as far back as 2003 (and 2002, when I was looking for an area/field to specialize in). Now that 48-core experimental processors, from Intel no less, are becoming a reality, there’s no more time to waste hoping that programmers will be able to avoid the intricacies and the basics of concurrent programming and parallel computing.
Maybe the tools will advance, much like how Moore’s Law allowed programmers to get sloppy about the straight-line sequential performance of code, opting instead for non-robust scripting-style languages. Now that Moore’s Law is moving sideways (horizontal scalability: adding more cores instead of faster processors), programmers left in the dark about how to do proper and efficient multi-threaded programming will have a hard time adapting. This is the next scarcity that all future businesses and consumers will rely on to be filled: the skills and experience required to develop applications that leverage the parallel computing paradigm. I personally do not believe that the situation will be remedied even if the tools advance to a point that allows programmers to hide behind abstractions for parallelism. After all, someone’s still going to have to write the libraries, the virtual machines, and the systems, and then implement the algorithms and work within the paradigm of parallel computing.
So if you’re a student right now thinking of going into Computer Science, better choose a school which has facilities, curricula, and instructors that cater to and offer parallel computing as part of as many subjects as possible. If you don’t have that luxury of choice, you can always keep abreast of the times and learn on your own, leveraging the vast body of research in the parallel computing field dating back to the late 60’s and late 70’s.
If you are already a professional programmer and would like to equip yourself with relevant skills for the next decade, I suggest you start picking up languages like C++ (C++03 and C++0x/1x) and Haskell, which offer better support for concurrency at the language level and get translated to actual machine code. Getting the basics down and reaching an intermediate level of proficiency with the tools and the technologies (and, more importantly, the concepts) is a skills upgrade that you owe yourself to stay relevant and competitive in the coming years.
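As a rough illustration of the parallel “for” loop idea from the quote, here is a minimal sketch using the thread support that C++0x standardised as std::thread. The function name parallel_sum and the chunking scheme are assumptions made for this example, not something taken from the article.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums `data` by splitting it into `num_threads` contiguous chunks and
// summing each chunk on its own thread. Every thread writes only to its
// own slot in `partial`, so no locking is needed.
long long parallel_sum(const std::vector<int>& data, std::size_t num_threads) {
    if (num_threads == 0) num_threads = 1;
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    // Chunk size rounded up so that the last chunk picks up the remainder.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(t * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        workers.emplace_back([&data, &partial, t, begin, end] {
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    // Wait for every worker before combining the partial results.
    for (std::thread& w : workers) w.join();
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

The loop body here is safe to run in parallel only because the iterations are independent of one another, which is exactly the kind of criterion the quote says students need to learn to recognise.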
How about you, what do you think about the current state of Computer Science education and the acceleration of the need for more parallel computing expertise and skills in the industry today?
Posted in c++0x, insights, parallel


I ran into this interesting article from Wt which reads:
As they only say that “the bulk” is running PHP (edit: for those of you to lazy to read about the Facebook architecture [1], that is solely Apache/PHP, no database, no memcache, and to quote Jeff Rotschild of Facebook: “the need for those is a function of the runtime efficiency issues of PHP” [5]), let’s assume this to be 25 000 of the 30 000 (edit: and this would be in line with other bits of info that they run around 800 dedicated memcached servers and a few thousand database servers). If C++ would have been used instead of PHP, then 22 500 servers could be powered down (assuming a conservative ratio of 10 for the efficiency of C++ versus PHP code [4]), or a reduction of 49 000 ton.
This is interesting and speaking from personal experience I would say this is not unheard of. For instance, I was working on a C++ project that could handle a sustained actual 5000 requests per second over HTTP 1.1 persistent connections. This is on the server side.
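The server arithmetic in the quoted passage is easy to check. The sketch below only reproduces the quote's own assumptions (25 000 PHP servers and an efficiency ratio of 10); the function name servers_powered_down is hypothetical.

```cpp
// Number of servers that could be powered down if the same load ran on
// code that is `efficiency_ratio` times more efficient. The count of
// servers still needed is rounded up, since a fraction of a server
// still requires a whole machine.
long servers_powered_down(long servers, long efficiency_ratio) {
    const long still_needed =
        (servers + efficiency_ratio - 1) / efficiency_ratio;
    return servers - still_needed;
}
```

With the quote's figures, 25 000 servers at a ratio of 10 leaves 2 500 servers running, so 22 500 could be powered down, which matches the number in the article.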
One thing I’m trying to do with cpp-netlib, and the reason I released the HTTP server, is to allow others to take the embeddable HTTP server and create a scalable, high-performance C++-based web back-end that competes with, or can be as stable as, the Java container model. It’s no secret that good memory management, doing only what’s necessary, and working with best practices do a lot for offering great web applications.
Maybe one day I can integrate Wt with cpp-netlib so C++ developers interested in doing web application development can benefit from the performance and simplicity of the cpp-netlib HTTP server template and the wealth of widgets already available in Wt.
I’d definitely love to know what you think about this!
Posted in insights, library, project


I’m now back from an extended holiday break and am getting back on the more active blogging course. My previous engagement with a full-time client has ended and I now have more time to concentrate on three things that I absolutely love and am passionate about.



After a long couple of days working on some requirements of one of my consulting projects and optimizing my workflow, I’ve moved forward with moving the development of The C++ Network Library to Github. In the span of a few days, I’ve managed to move the cpp-netlib project to Github and release a new version of memcache++ (0.11.4) which is being developed in Github too.
Today marks the first day of the cpp-netlib project on Github, which allows interested contributors and users alike to fork the repository, make their own changes, and ask to have the changes merged into the main project. The project page and the developers mailing list are still up on Sourceforge; whether that changes over the next few months depends on the pace of development/adoption on Github.
Over the next month or so I will also be implementing the following features for the release of version 0.5, in preparation for releasing version 1.0, hopefully in time for BoostCon 2010.



So after years of loyalty to Sourceforge on some of my open source projects, I’ve elected to move two of my current C++ projects to Github. Starting today the memcachepp client will be developed almost actively on Github, and soon the cpp-netlib project will be moving to Git too and will also have a repository on Github. The mailing lists will still stay on Sourceforge and most of the tracker bugs may stay — although I prefer tracking issues on Github from now on too.
Why am I moving to github? Here’s a quick rundown on my personal reasons:



So recently I’ve been doing more and more open source C++ development over at cpp-netlib cleaning up the implementation of a simple URI library for the release of version 0.4. I still haven’t gotten myself that MacBook Pro that I’ve always wanted to get but I did settle for an Acer Aspire One. That said I have quite a number of things to share regarding the experience of doing programming on a netbook. If you’re interested in reading further there’s more after the jump.
Update: Changed typo in Title.
Before you go ahead and say “what am I thinking programming C++ on a Netbook with barely half the power of any decent desktop?!” I want to say I’m doing this to get a better idea of what these machines are like. I have been taking this netbook with me pretty much wherever I go and am able to write and read code just fine with the screen. That said, there are some caveats as you might expect me to say.
The Environment
Right off the bat, I would like to describe the environment I’ve been working in. I have an Acer Aspire One D520-1Bb that originally came with Windows XP. I have since wiped it clean, installed Ubuntu Karmic Koala Netbook Remix on it, and installed the necessary build requirements.
I personally use Boost.Trunk for all my open source development that relies on Boost (which is pretty much all my active open source C++ libraries like memcache++ and cpp-netlib) so I have subversion and git installed too. That said, I have a lot of disk space available for programming and anything else I pretty much do (140GB of goodness). However, I only have 1GB of RAM to work with so I stay within reason when running applications.
The Compile
One of the first questions that comes up when programming in C++ with any sort of modern C++ techniques (like template metaprogramming) is “how long is the compile time?”. The answer is “it depends”, generally on which compiler you’re using and what your machine specs are. We already know I have an Intel(R) Atom(TM) CPU N280 @ 1.66GHz (which is a dual-core Atom) and 1GB of RAM available. What better answer than to actually time the building of cpp-netlib, which is a header-only library that leverages C++ template techniques for extensibility and modularity (and also uses Boost.Spirit 2.1, which is one of the best examples of the power of template metaprogramming)? Using GCC 4.4.1 and the latest Boost.Trunk checkout set up as my BOOST_ROOT, I get the following numbers:
$ time bjam -a
…
real 12m23.277s
user 10m14.498s
sys 1m8.464s
Yes, that’s true. And no, the numbers are not impressive. But hey, if you really needed to see performance of C++ development on an Atom Netbook with the modern C++ approach there, then what do you expect?
The Lesson
Notice that I didn’t use both cores to compile and run the whole test suite. This is a dual core after all, but when I use both, responsiveness slows to a crawl for everything else that I do. The following is the list of things I learned from trying it out full time:
Have you had similar or better experiences with software development on Netbooks yourself?
Posted in boost, experiment, library


The Dubai meltdown is another disaster caused by groupthink. No one in the emirate was willing to question the soundness of its development plan until it all came crashing down.
One of the most difficult things in Asian business is encouraging a culture of frankness and the willingness to challenge opinions of one's co-workers, even one's superiors. In most Asian cultures, conflict is something to be avoided at all cost. You will almost never hear an outright argument in an Asian boardroom.
Yet conflict is essential in any healthy organization. An organizational structure naturally puts people in conflict with one another - Sales is in conflict with Operations since more sales means more of a burden for Operations, at the same time Sales may be selling something that Operations cannot effectively deliver. Finance is in conflict with other departments as it seeks to control costs, while at the same time it may be hampering the ability of the departments to operate effectively. And there should be a natural conflict between the CEO and his department heads since it's his role to critique the others' work, while the department heads should question the soundness of the CEO's overall plans.
If people care about their work, they will end up coming into conflict with one another. Many of these conflicts can be resolved amicably, but not all of them. Either an outright conflict can occur, or people just bury their disagreements in silence. In most Asian cultures, an outright conflict is usually taken as a personal attack, and permanently harms working relationships, which is why they're usually avoided. Burying disagreements in silence is the more common choice, though very little gets resolved in this route.
The worst case is when people stop caring about their work! Conflict stops, everyone is enjoying cordial relationships, but a disaster is lurking just around the corner.
What you want to encourage in your company is a culture of "Creative Conflict". Everyone in the company should come to expect that conflict, even outright argument, is a natural part of their work environment. In an environment where everyone cares about their work, each one should be willing to argue their opinion on what they think is best for the company. Each one in turn should be willing to listen to the logic of another's argument and argue back if necessary, with the intention not to win but to find the best solution to the issue, something that's not easy to do when tempers are already involved.
At the end of the day, people should have the attitude of considering arguments as all part of a day's work. Arguments should never be taken personally. Two people who just had a yelling match during the afternoon should be able to have a drink together after work.
It's a huge cultural leap for most Asian companies to embrace conflict as a part of their culture. The amount of effort to change people's mindset about conflict is huge since it's ingrained in our upbringing. However, if you seek an environment where people care about their work and where problems are resolved quickly instead of being swept under a rug (where they continue to grow), a culture of creative conflict is key.

