The application that I'm working on talks to two databases. In one database, there's a table called
user
with columns like id
, firstname
, lastname
, ssn
, and so on. The information in some of the rows in this table has also been stored in the second database. This is done for reasons of efficiency, so that when querying the second database, we don't always have to dip into the first database to get user information.This seems perfectly reasonable to me, except for the way that it was done. Way back in the stone ages of our application, someone decided to store all the
user
information in a single column in the second database. It is stored as a set of key-value pairs in a CSV string, in a column called user
, like this: "user=3378,first=John,last=Smith,ssn=123456789,..."Since I first encountered this, I've been cursing scratching my head about why someone would decide to store all of this information in a single column. It makes queries more difficult to read and write, because it requires regular expressions to extract single items like the social security number from the
user
column. I can imagine this is also a performance hit.In addition, whoever decided on this format did not replicate the
user
table column names for the keys in the CSV string. So the user.firstname
column in database 1 maps to first
in the CSV, user.lastname
maps to last
, and so on. This makes life more difficult when searching the code base for all queries related to a particular piece of user information. I recently fixed a bug caused by this mapping difference.I'd love to have a conversation with the person who decided to do things this way. It would either be enlightening, or just plain crazy.
No comments:
Post a Comment