Skip to content

O Brave New Blog

05-Aug-10

… to have no posts in it.

The server that I was hosting my blog on just died, and I will (possibly) eventually pull some of the old posts out here.  But I have been wanting to move to hosting for a while anyway, so here we are.

Bonus technical tip: if your dhcp server dies and you bring up a new one, Windows 7 freaks out and decides you’re on a new network and shuts down filesharing.  Fix is to tell Windows (again) that the new network is a “Home Network”.  The new dhcp server might need to be on a different IP to trigger this behavior; mine happened to be.

Unemployment And Make-Work Jobs

05-Aug-10

Here’s something I’m unclear about: Suppose you’re unemployed and the government gives you a make-work job — e.g., digging ditches and filling them in again, or breaking windows and replacing them.  This is supposed to stimulate aggregate demand.

Then when the government ditch-filling program ends, do you get unemployment benefits?  It looks like you can get them when your census job runs out, so I would presume so.

So in a Keynesian economic system, it would be good to do that, because it would help make the recession less deep.  In a recalculation view, wouldn’t it  just prolong the recession (“recalculation period”) by creating alternatives to the “new patterns of specialization and trade” that are necessary to resume growth?

Deleting From A Collection in STL

28-Jul-10

Deleting from a collection is hard in STL. Or rather, it’s subtle. Which is roughly equivalent to hard.

Jim Beveridge talks about STL erase in his latest blog post and that inspired me to spend some time messing around with STL containers last night.

You can remove all even numbers from a set with Jim’s code:

set<int> coll;
/* ... */
for (set<int>::iterator itr=coll.begin(); itr!=coll.end(); /*EMPTY*/ )
   if (*itr%2 == 0)
      coll.erase(itr++);
   else
      ++itr;

But you can’t take this code, replace “set” with “vector”, and use it to remove all even numbers from a vector. For one thing, inserting or deleting an element in the middle of a vector invalidates all iterators that point to elements following the insertion or deletion point, so the trick of post-incrementing itr doesn’t help. You do indeed get a pointer to the next element which is valid until erase is called, which then invalidates the iterator. Fortunately erase doesn’t reallocate the vector’s storage, so the iterator is just off by one. Still, if you populate your vector with the values 1,2,2,4,5 and run it through the above loop, you’ll wind up with 1,2,5.

Here’s how to remove all the even numbers from a vector:

vector<int> coll;
/* ... */
for (vector<int>::iterator itr=coll.begin(); itr!=coll.end(); /*EMPTY*/ )
   if (*itr%2 == 0)
      itr = coll.erase(itr);
   else
      ++itr;

Note that vector::erase returns an iterator which points to the next value (and, incidentally, the same memory location). That’s because it’s an STL sequence, while set is an associative container. (There are five different STL containers: the three sequences vector, list and deque and two associative containers set and multiset.)

So, given that there are two different interfaces for erase corresponding to two different container types, how can I write a template function that deletes even elements from a templated container? I can’t just write two versions of delete_even with different implementations; they need to have different signatures. I can’t use iterator tags because container category isn’t a property of the iterator: set and list both have bidirectional iterators, but are in different container categories.

Here’s what I wound up with:

struct assoc_container_tag {};
struct sequence_tag {};

template <class T> sequence_tag container_category( const vector<T>& v ) { return sequence_tag(); }
template <class T> sequence_tag container_category( const list<T>& v ) { return sequence_tag(); }
template <class T> sequence_tag container_category( const deque<T>& v ) { return sequence_tag(); }

template <class T> assoc_container_tag container_category( const set<T>& v ) { return assoc_container_tag(); }
template <class T> assoc_container_tag container_category( const multiset<T>& v ) { return assoc_container_tag(); }

template <typename T, typename P> void delete_if( T& t, P pred, assoc_container_tag )
{
  for( typename T::iterator i = t.begin(); i != t.end(); )
    if( pred(*i) )
      t.erase( i++ );
    else
      ++i;
}

template <typename T, typename P> void delete_if( T& t, P pred, sequence_tag )
{
  for( typename T::iterator i = t.begin(); i != t.end(); )
    if( pred(*i) )
      i = t.erase( i );
    else
      ++i;
}

template <typename T, typename P> void delete_if( T& t, P pred )
{
  delete_if( t, pred, container_category(t) );
}

I defined two empty types to differentiate between the two container categories. I wrote a templated function container_category and partially specialized it for the five basic container types, returning the appropriate container category tag. Then come two container-category-specific versions of the deletion loop, using either Jim’s postincrementing iterator trick t.erase( i++ ); or the returned iterator i = t.erase(i). The entry point is the two-argument version of delete_if, which uses container_category to select (at compile-time) the appropriate three-argument version of delete_if to call.

Note that to make it look more like STL I’ve factored out the test for evenness, and now you can pass in an arbitrary predicate. And I’ve changed the routine name from delete_even to delete_if.

Why does that sound vaguely familiar?

[…]

Oh yes. Because there is an STL function remove_if which does exactly the same thing, only better, because you can give it an iterator range instead of an entire container.

So: was all this work wasted? No! Object categories are still a useful technique, and it’s good to play around with them so that they’re familiar when needed. It’s important to think about the properties of containers and their iterators, and to know which operations can invalidate iterators. I had an insight about vector::erase and iterator invalidation, but I’m not sure how to put it into words yet.

And anyway, remove_if has problems of its own, which I’ll get to in another post.

Cygwin setup.exe not a valid win32 application

23-Jul-10

I ran into this error – “setup.exe is not a valid win32 application” the other day while I was trying to upgrade cygwin (1.5 -> 1.7) on an old laptop.

Underlying problem was that I was upgrading under a different account than had been used to originally install cygwin. Switching to that account and running the upgrade was an effective workaround. (Ineffective attempted workarounds included re-downloading several times, explicitly running as administrator, swearing, and banging on the desk).

So: in case anyone else gets bitten by it, try that.

C++ Runtime Types: Find All Classes Derived From a Base Class (3)

19-Oct-09

Working on a way to find all the derived classes of a base class: part 1 is here, discussing the motivation and frame of the problem.  Part 2 is here, discussing one approach that works but not for me.

The problem is laziness: if I am in a hurry to put in a new error message and can build a version and send it to testing without updating the error class registry, then eventually I will do it.  So how can we use laziness to our advantage?  How can we make having a consistent error registry the easiest way to get the code to compile?

I found an article on codeproject that also provides a framework for runtime derived-class discovery.  It’s substantially uglier than the meatspace solution quoted in part 2, but it addresses all of the weaknesses: index reversal, multiple hierarchies, and most importantly (for me) compile-time error if a derived class is not registered.  What we’re going to wind up with is this:


class CError: {
DECLARE_ROOT_CLASS(CError);
public:
  // body as in part 1
};

// error logging function
void errorlog( const CError& );

class CDerivedError: {
DECLARE_LEAF_CLASS(CError);
public:
  CUnableToOpenFile() { Init( L"sample.txt", 2 ); }
  CUnableToOpenFile( const wchar_t * psPath,
       DWORD dwError = GetLastError() )
  { Init( psPath, dwError ); }
  format_type Format() const
  { return "Unable to open file {0}: {1}"; }
};

Later on, in the implementation file for the error system:


IMPLEMENT_ROOT_CLASS(CError)

IMPLEMENT_LEAF_CLASS(CError,CUnableToOpenFile)
IMPLEMENT_LEAF_CLASS(CError,CBadMagicNumber)
// etc.

Why does this help? Why does it make any difference at all? By designing the system this way, there is now a chain of compile-time or link-time — at any case, not runtime — dependencies that make maintaining a consistent error table the easiest way, the laziest way, to add an error.

So under the hood, what are the macros doing? DECLARE_ROOT_CLASS() is the heaviest. It declares static member functions on CError, and it declares a vector of class factories as a static member variable. (Notice that it does not use the elegant Singleton pattern that the meatspace solution did, and I might change that.) It also declares a pure virtual function on CError — GetClassID().

IMPLEMENT_ROOT_CLASS() is mechanical, it just provides the implementation of the factory-adding mechanism.

DECLARE_LEAF_CLASS() is an interesting beast. It declares a static class variable. It implements two functions that are necessary to make CDerived a concrete class: it provides an implementation for GetClassID(), and it provides an implementation for CreateObject(), required by the class factory template. If I remove DECLARE_LEAF_CLASS, I get these errors:

errorlog.cpp(50) : error C2039: ‘CreateObject’ : is not a member of ‘CUnableToOpenFile’

e\ unabletoopenfile.h(3) : see declaration of ‘CUnableToOpenFile’

errorlog.cpp(50) : error C2259: ‘CUnableToOpenFile’ : cannot instantiate abstract class due to following members:

e\unabletoopenfile.h(3) : see declaration of ‘CUnableToOpenFile’

errorlog.cpp(50) : warning C4259: ‘int __thiscall CError::ClassID(void) const’ : pure virtual function was not defined errorlog.h(31) : see declaration of ‘ClassID’

And  IMPLEMENT_LEAF_CLASS() defines the static class variable that was declared by DECLARE_LEAF_CLASS().  The initialization of that static variable (at runtime, before main() executes) causes the class to be registered in CError’s factory table.  So if I create and use an error class but forget to use the IMPLEMENT macro, I get the following errors:

File.obj : error LNK2001: unresolved external symbol “private: static class CBootStrapper<class CError> CUnableToOpenFile::s_oBootStrapperInfo” (?s_oBootStrapperInfo@CUnableToOpenFile@@0V?$CBootStrapper@VCError@@@@A)

.debug/program.exe : fatal error LNK1120: 1 unresolved externals

To recap: there is a chain of compile-time or link-time errors that ensure that when a new error class is created, it is available through the runtime error-class table maintained in CError.  The chain runs like this:
  1. Convert an old-style error message to a new one by creating  a call to errorlog( CErrorName( arg1, arg2 ) );
  2. Errorlog requires a class derived from CError, so create CErrorName as a new CError-derived class in a header file
  3. In order to derive from CError, we must supply bodies for pure virtual functions declared in CError; the simplest way to do that is with DECLARE_LEAF_CLASS
  4. External dependencies are created by DECLARE_LEAF_CLASS, and the simplest way to resolve them is to use IMPLEMENT_LEAF_CLASS
  5. IMPLEMENT_LEAF_CLASS causes the class to be registered in the error table.
There are other design issues (like, “Why did the derived error class suddenly get a default constructor?”) which I hope to address in a later article; also there are other design considerations about the error logging, the choice of fastformat, “fun” “quirks” of fastformat, getting the errors into a useful database in a useful way (in this case it means using sqlite and possibly using an ORM), which will come up later.
I also made some changes to the original version of BootStrap.h which I hope to publish soon.  I’m not yet completely happy with the templates and macros that I’m using, but I haven’t figured out what I want to change yet.

C++ Runtime Types: Find All Classes Derived From a Base Class (2)

18-Oct-09

Working on a way to find all the derived classes of a base class: part 1 is here, discussing the motivation and frame of the problem.

StackOverflow gives a pointer to meatspace (a blog), where he’s got a solution involving factory methods and a single macro per class.


class Base;

typedef Base* (*base_creator)(void);

class BaseRegistry {
    public:
        std::vector m_bases;
};

class BaseRegistration {
    public:
        BaseRegistration(base_creator);
};

#define AUTO_REGISTER_BASE(base) \
    BaseRegistration _base_registration_ ## base(&base_factory);

It’s functional: you can walk through all the registered base classes and create an instance of each.   The example doesn’t provide a way to reverse the index (to go from a derived class its position in the factory method table) but that would be easy to add.

Another minor complaint is that this example hardcodes the identity of the base class. If you wanted to have two distinct hierarchies of derived classes, you’d need to change a few things, not just a single template argument.

Another minor complaint — so minor that I hesitate to mention it — is that each of the derived classes is assumed to have its own implementation file. That’s fine for a situation where the derived classes are doing some actual work, but my error classes are very lightweight and won’t have their own implementation files. This is a purely aesthetic objection.

The main drawback that I see is maintainability: it seems too easy to fall through the cracks. I can declare a class derived from type CBase and if I forget to put the AUTO_REGISTER_BASE(CDerived) macro in an implementation file, the class doesn’t get registered and therefore can’t be enumerated. In this framework, there’s no way to enforce the rule that each derived class must be registered.

C++ Runtime Types: Find All Classes Derived From a Base Class (1)

17-Oct-09

I’m working on a program in C++, recently converted from C.  I’m redesigning the error logging subsystem to have type-safety, log-to-database, and also to make it possible to generate a list of all the errors in the system.  So what I’ve decided to do is have a base class CError, and require that all the errors derive from it:

class CError
{
public:
  typedef const char * format_type;
  typedef std::string message_type;

  virtual format_type Format() const = 0;
  virtual format_type Message() const { return Format(); }

  virtual ~CError() {}

private: // disable copying
  CError( const CError& );
  CError& operator=( const CError& );
};

And an actual error message might look like this:


class CUnableToOpenFile : public CError
{
public:
  CUnableToOpenFile( const wchar_t * psPath, DWORD dwError = GetLastError() )
  { Init( psPath, dwError ); }

  format_type Format() const { return "Unable to open file {0}: {1}"; }
};

Init is a function that converts the error code to a string and uses fastformat to format the string.  (I’m omitting some of the mechanism here, because I want to talk about the runtime type discovery issue.)

OK, so now what?  I have a bunch of types that derive from CError, and no way to enumerate them at runtime.

Well maybe I could do what ruby does, enumerate all the classes in the type system and see if they can be cast down to CError.  I could do that if I had a list of all the classes in the system, which Ruby does have; the  C++ runtime has it too (to implement type_info and dynamic_cast), but it can’t give it to me.  I still need to separately maintain some a list of (at least) the types I’m interested in.

Finding C-Style Casts

06-Oct-09

I updated this script to make it work a little better with Visual Studio (VS6 anyway).  It now uses the paren-printing convention so you can use F4 to find the next cast.

Requires perl.  Add it as a tool under Tools | Customize ; Command is the path to your perl.exe, Arguments is path to this script followed by $(WkspDir).  Check “Use Output Window”.

Updated script is here (rename extension to .pl).

In case you needed another reason not to use VC6

26-Aug-09

The MSVC6 implementation of STL omits push_back() on std::basic_string, meaning that

std::string s,s2;

std::copy( &s[0], &s[n], std::back_inserter( s2 ) );

gives a syntax error; but it shouldn’t.

ATL 3.0 (VS6.0) Control Creation Resource Leak

09-May-09

So yeah, this post isn’t about homeschooling. More…