Tech: In The Fray – Distributed Computing et al.

June 28, 2009


So okay, I have been really rolling up my sleeves, digging into the distributed computing world again. I took a break from the subject briefly as I was hemmed up being a company’s code whore, rabble rabble rabble, chomp chomp pachewy chomp :-) . “But that’s alright, cause now I’m back so kill all the rumors and straighten the facts”. I broke out of that place and joined an upstart little startup search engine that could‽ ;-) . They had the hubris to go head to head and play with the big G. I was enamored with the place and the idea and the people and, as it turned out, I learned a lot about search and building ‘internet scale’ systems. I was turned on to map-reduce and learned what it took, from A-Z, to make a big distributed system serving petabytes of data, and how to make it fast and scalable. It was not without its trials and tribulations, but that’s where most of the good learning happens, right‽ ;-) . I have since parted ways with said company, but the fire that they rekindled in me as a big systems developer is burning bright. I have been immersing myself back into the state of distributed computing, reading paper after paper and blog after blog and loving it.

I have decided to start my own company. I have a service/product that I want to build that I think will be HOT (read: fun and lucrative). I have been thinking about it for years now. The question at hand is, okay smarty… how do you build it right? Or at least, how do I make the rightest first steps I can make? So, here are the questions that go through my mind when starting from tabula rasa.

    Questions:
    What platform? (Linux)
    What source code management system to use? (GIT)
    What language(s) do I use? (Java and Python)
    What web server should I use? (Apache or Lighttpd – not decided)
    What build system should I use? (ANT)
    How am I going to store the data? (Schema-less dunno which one yet)
    How am I going to process the data? (Hadoop MapReduce)
    What will I use to build the front end? (HTML/CSS/JavaScript – (JQuery))
    What will be the engine for the front end? (Web.py)
    How can I keep learning and having fun!? (Don’t work for anyone anymore!)

So, I started with the easy things first. Pretty much 99% of my development over the past 14 years has been on a *NIX platform, primarily LINUX so… I’ll stick with what I know, the good stuff – Platform = LINUX. That was easy. Okay, so as for what scm to use; well, I was using Mercurial (Hg) at the last gig, which was a refreshing change from the CVS that I was using up until that point. I remember seeing a Google presentation about GIT; it sounded great so I tried it… and fell madly in love! Regarding what programming language to use; well, that was an interesting one. I have been coding in Java since June of 1995, it is my home language and I think it is pretty freakin’ awesome, so Java is my primary choice of language. But Java, as great as it is at most things, I always found ill-suited for the quick changing, desultory world of web development (IMHO). I was introduced to Python in this context over the past year and a half and have seen its merits as a front-end / rendering engine language and platform. So I decided that having the best of both worlds would leave me with Java in the middle and back-end and Python on the front-end, using Thrift as the over the wire glue. Thrift essentially hedges my front-end language bet, since I am able to wholesale replace the rendering engine language if necessary. Okay so cool ;-) . Next: Hmmm… as for web servers, the jury is still out on that. Apache and Tomcat are great and tried and true, but I have been hearing lots of good stuff about Lighttpd (lighty), especially in the context of Python. As the front-end stuff is my weakest skill set I will wait to see how that shakes out. Moving on; the build system, well, as the child of Java that I am, I am going to have to go with ANT. IMHO it is the only way to go. So okay, so much for the easy questions :-) …. Now the harder ones.

As I have been observing, the word on the net is that schema-less databases are in and the concept and implementations are hitting their stride. Well, I am not one to go with fashion, so I took a really good look at the issues and arguments to be made both for and against schema-less solutions. It turns out that there are some strong salient points to be made for using schema-less databases (Ex: a great post from FriendFeed etc…, I can get into that in other posts but for now, trust me :-) ). For me, I have never been a fan of elaborate data schemas nor the copious SQL that went along with them. I always felt that the SQL query language was conflating data storage with data manipulation and forcing you to forecast data decisions and relationships too early in the software life cycle, instead of simply capturing and persisting the data and allowing it to be manipulated as necessary as the evolution of the program dictates; it should not leave you going through code and mental gymnastics on how to fit a round peg (once square) into a square hole. Besides, I feel that schema-less is the better way to go for scalability and indeed simplicity. So that answers how I am going to store my data, but then what would I use? I came across a really nice blog post, “Anti-RDBMS”, that did a breakdown of a few front running candidate systems… it got me spinning off digging into these offerings. At the moment the front runner is Project Voldemort but I have not ruled out HBase or Cassandra. As I have been researching these tools I read the Dynamo paper and fell in love with it. It is one of the best papers that I’ve read in a long time. It led me down a rabbit hole of papers (will list in a future post) that have kept me fascinated and wide eyed and eagerly learning. And pretty much that is where I am… reading and learning and excited!
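To make the "capture the data now, decide how to query it later" idea a bit more concrete, here is a minimal sketch in Python. It is a toy in-memory store, not the API of Voldemort, HBase, Cassandra or FriendFeed's system; the class name SchemalessStore and its methods are made up for illustration, and it assumes indexed fields hold simple hashable values.

```python
import json
import uuid
from collections import defaultdict


class SchemalessStore:
    """Toy in-memory store: entities are opaque JSON blobs keyed by id.

    Hypothetical illustration only; not any real product's API.
    """

    def __init__(self):
        self._blobs = {}     # entity id -> JSON string
        self._indexes = {}   # field name -> {value -> set of entity ids}

    def put(self, entity, entity_id=None):
        """Persist a dict as-is; no schema is declared up front."""
        entity_id = entity_id or uuid.uuid4().hex
        if entity_id in self._blobs:
            self._unindex(entity_id)  # drop stale index entries on overwrite
        self._blobs[entity_id] = json.dumps(entity)
        for field, index in self._indexes.items():
            if field in entity:
                index[entity[field]].add(entity_id)
        return entity_id

    def get(self, entity_id):
        return json.loads(self._blobs[entity_id])

    def add_index(self, field):
        """Build an index after the fact, once a query pattern emerges."""
        index = defaultdict(set)
        for entity_id, blob in self._blobs.items():
            entity = json.loads(blob)
            if field in entity:
                index[entity[field]].add(entity_id)
        self._indexes[field] = index

    def find(self, field, value):
        """Look up entities through a previously added index."""
        ids = self._indexes[field].get(value, set())
        return [self.get(i) for i in sorted(ids)]

    def _unindex(self, entity_id):
        old = json.loads(self._blobs[entity_id])
        for field, index in self._indexes.items():
            if field in old:
                index[old[field]].discard(entity_id)


store = SchemalessStore()
alice = store.put({"name": "alice", "city": "NYC"})
# A new field shows up on a later entity; nothing has to be migrated.
store.put({"name": "bob", "city": "NYC", "tags": ["java", "python"]})
# The index is added only once we know we want to query by city.
store.add_index("city")
names = sorted(e["name"] for e in store.find("city", "NYC"))
```

The point of the sketch is the ordering: the data is persisted first, and the index (the "relationship" decision) arrives later without touching what is already stored.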

Thus far, I have proven the path with respect to my nascent project. I can go from the front-end, in Python served up by web.py via lighttpd – talking Thrift – to the middle “business logic” layer – to… a faux back-end yet to be determined. I can build and deploy it from ANT and it’s in GIT and all is well. Now I just need to settle on a back end and then get back to coding. Oh, by the way, I have had to teach myself JavaScript as I have surrendered to the webbies. They win, I guess (I’ll rant on that in another post). These days you can’t do a damn project without them ;-) ! So I am having to become a bit of a webbie myself. After much investigation I decided to use JQuery as my AJAX toolkit. There is a blog post from Mike Miles that was good reading and has some good links fleshing out this issue.
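For the curious, the shape of that path can be sketched in a few lines of plain Python. To be clear, this is not the real thing: there is no Thrift, web.py or lighttpd in it, everything runs in one process, and the names ProfileService, BusinessLogic, FauxBackend and render_profile (plus the sample data) are hypothetical stand-ins. It only shows why a service contract in the middle lets the back end stay undecided.

```python
from abc import ABC, abstractmethod


class ProfileService(ABC):
    """Hypothetical service contract; in the real setup this role would be
    played by an interface generated from a Thrift IDL file."""

    @abstractmethod
    def get_profile(self, user_id: str) -> dict:
        ...


class FauxBackend:
    """Stand-in for the data store that has not been chosen yet."""

    def __init__(self):
        # Made-up sample row, purely for illustration.
        self._rows = {"42": {"name": "Ada", "langs": ["Java", "Python"]}}

    def load(self, key: str) -> dict:
        return self._rows[key]


class BusinessLogic(ProfileService):
    """Middle layer: the only code that knows a back end exists."""

    def __init__(self, backend):
        self._backend = backend

    def get_profile(self, user_id: str) -> dict:
        row = self._backend.load(user_id)
        return {"name": row["name"], "summary": " and ".join(row["langs"])}


def render_profile(service: ProfileService, user_id: str) -> str:
    """Front end: depends only on the service contract, not on the back end."""
    profile = service.get_profile(user_id)
    return "<h1>{name}</h1><p>Codes in {summary}</p>".format(**profile)


page = render_profile(BusinessLogic(FauxBackend()), "42")
```

Swapping FauxBackend for a real store should only touch BusinessLogic; the rendering code never sees the difference, which is the same bet the wire protocol makes for the front-end language.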

Okay, more on things distributed in future posts, I have lots to add and things to share, but I’ll stop here for now. I’ll also try to be more informational and objective so that I don’t sound so opinionated…. I’ll try ;-) .


Tech: My Comment on NY Times Article “Microsoft’s Vista Problem, by the Numbers”

October 25, 2008

The Times had an interesting article discussing Microsoft’s slump in Vista sales and where they are not addressing the needs of the global computing zeitgeist w.r.t. netbooks et al.

The thing is, Microsoft has been able to enjoy deep market penetration at the zygote of the business computing age. But let’s be clear; they have never innovated. They were born from the theft of the desktop from Xerox PARC. They have done a great job of copying or ‘buying out’ good ideas and watering them down and regurgitating them back to us. It can be argued that the only two things they got right were Excel and PowerPoint. The thing is, Mac “just works”. When people ask me what to get, I say get a Mac, because it just works. But that’s not the only thing. Mac IS UNIX. It is a Mach kernel accompanied by BSD style servers. It is essentially the NeXT workstation under the Apple. I used to own a NeXT workstation. I write software for a living so I know what I am talking about. The beauty of what Apple has done is that it took the power of UNIX and wrapped it in a beautiful user interface. On a Mac you can go arbitrarily deep under the hood (just short of source code). I also use Ubuntu, and regardless of what anyone would tell you… LINUX *is* an operating system. Ubuntu has its flaws but it is a great LINUX distribution. And the price is great! FREE! So if you want the spit and polish of a great UI wrapped around a proven solid UNIX core, get a Mac; if you are the more do-it-yourself type, get Ubuntu or any other LINUX distro. But essentially this leaves not much room for Windows. The fuse is lit; people today are more sophisticated and computer savvy and don’t need Windows to tell them how to live their computing lives. We can make a choice. Windows…, no matter how many names they give themselves… Like it’s been said, “Just because you put lipstick on a pig….” :-)

Your friendly neighborhood coder
— Max G Faraday


Tech: Damn that iPhone…

July 6, 2007

Yeah yeah yeah… the iPhone, that’s all my geek pals have been buzzing about. A bunch of my buddies got the iPhone, but though it is sexy as all hell (oh, and did I mention it is sexy as all hell?), I didn’t cop one. The main issue I have with the phone is that it would pull me away from my kick butt T-Mobile plan. It turns out that AT&T sucks compared to T-Mobile regarding coverage and cost (at least for my old grandfathered plan from 1998). So, how can I reconcile my need to get this phone AND my need to stay on T-Mobile, which all my “buddies” are on? So… I am stuck. The thing is, Apple has me wrapped around their finger, and the only reason I am sitting here not pulling the trigger on a purchase is because I can’t!!!! What the hell!? How long does it take them to re-up on the iPhones already!!!? Yeah, I’ll get one… it’s just too sexy. Yes, I am proof… good marketing and a good product can’t be beat. Actually the best would be to wait for some of my fellow hackers to unlock the phone (hint hint hint).

MGF aka sour grapes man.
(iPhone)


Tech: The answer is… Functional Distributed Programming

June 12, 2007

Okay, so let’s take a look at technology and let’s talk about where we are heading. Being both witness and participant in the technology revolution/evolution there are some fundamental characteristics I have noticed. Bear with me as I muse on where I see us going.

In the ‘early days’ of computing we were enamored with this new device, the computing machine that could do an amazing amount of calculations. The way we saw it was essentially as a great calculator. We used it to solve problems much the same way students use calculators for problem sets (oh the good ol’ days, HA). Much of the knowledge and context of the problem was in the head of the user, who used the computer solely for assisting in the computation. The next evolutionary step was using computers to actually model problems. The user’s job now was to convey to the computer not just what calculations needed to be performed, but also how.

Programming languages were awesome. They allowed us to codify our problem in such a way that the computers could understand, imagine that! There are many different programming paradigms but this Object Oriented thing caught most folks’ attention. The dawning of Object Oriented programming was a boon. Object oriented programming gave us a more ‘natural’ way of modeling the world around us. Today OO is all the rage and for good reason; it is a powerful paradigm for modeling. So what do we do with these models? Well, we put them into our big multi-purpose computer and have the models run. Even today we still look at computers as a sort of stage on which our models / software perform. As networking became more reliable and ubiquitous we have taken advantage of it… connecting our computers and distributing our software. I assert the vacuously true statement that ‘it is only the beginning’.

Today we are at the beginning stages of seeing computers take different forms. With the advent of laptops and PDAs and more sophisticated cell phones etc., we are at the burgeoning stages of what will shape the future of computing and, more to the point of this essay, shape the future of *software* and how we think of it.

So… We use OO to codify our modeling of the world… we put this model in motion on our computers. Now imagine that computers get even smaller and even more sophisticated, which they are, and as they evolve we begin to embed them into our world, into the physical objects we interact with. Yes, the OBJECTS we interact with. When a model becomes THE THING it ceases to be a model. For example, say we are modeling a simple stereo. We would create our model by deconstructing it into objects that represent its constituent parts. Then we would have these parts sewn together in our software and run that software on our computer. What happens when the computer is small enough, and is embedded into each of the real-world constituent objects? We no longer need to model the object because the object is already there, tangible in the real world. What we now care about is getting this real stereo to work by having the physical components actually talk to each other. What we care about is what they DO and how they interact. We care about their function. The actual name of the object becomes not so important; the partitioning of actions into encapsulated objects becomes not as important as exposing the actions to those who want to act on them.

Essentially the new shape of computers / computing hardware will begin to shape how we write software… and more importantly how we think about software. If objects no longer depend on software constructs to define them then the role of the software construct goes to zero. The key points of interest are the functions. So, in the distributed computing landscape, in an object oriented construct, we have objects that expose through “well known interfaces” the methods that they implement. The problem, as you’ve already gathered, is that not all interfaces are “well known”. The problem then becomes a matter of how to discover the actions an object can perform. Probably a better way, certainly a simpler way, is to have one invocation mechanism that would execute some action. I know you guys are thinking what I am thinking… right? Well, having functions as first class entities would allow functions to be passed around in a distributed environment to create the desired effect. Hence the future looks to be going the way of functional distributed programming. The evidence is already here: when you look at these next generation scripting/programming languages that allow you to create closures and lambda functions, well, the writing’s on the wall… we are moving closer to LISP.
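Here is a rough sketch of what I mean, in Python. Take it with a grain of salt: the Node class, its invoke() method and the stereo parts are all hypothetical, and everything runs in a single process. A real distributed version would have to ship the function over the wire somehow, which this sketch does not attempt. What it does show is one invocation mechanism per device, with the behaviour arriving as a closure or lambda.

```python
from typing import Any, Callable, Dict


class Node:
    """Hypothetical stand-in for a small networked device.

    It exposes exactly one entry point, invoke(), instead of a custom
    per-device interface that callers would have to discover.
    """

    def __init__(self, name: str, state: Dict[str, Any]):
        self.name = name
        self.state = state

    def invoke(self, fn: Callable[[Dict[str, Any]], Any]) -> Any:
        # The caller ships the behaviour; the node supplies its own state.
        return fn(self.state)


def set_volume(level: int) -> Callable[[Dict[str, Any]], int]:
    """Return a closure that captures `level` and applies it to a node."""
    def apply(state: Dict[str, Any]) -> int:
        state["volume"] = max(0, min(10, level))  # clamp to the 0..10 range
        return state["volume"]
    return apply


amplifier = Node("amplifier", {"volume": 3})
tuner = Node("tuner", {"station": 88.5})

# The same mechanism drives both parts of the stereo.
new_volume = amplifier.invoke(set_volume(11))            # closure, clamped
station = tuner.invoke(lambda state: state["station"])   # ad hoc lambda
```

Notice that neither caller needed to know a method name on the amplifier or the tuner; the function is the interface.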

Think about it… ;-)

