Comment spam is too much

First, I want to apologize to all those who have put good comments up on this blog and I haven’t approved them for days or weeks. I just don’t have the time to be checking all the comments I get.

Second, I’m certain that I have deleted good comments and for that I apologize.

Lastly, I’m planning on fixing this ASAP. I’ll either be moving to some other blog system like Google’s or implementing some more spam controls here. I haven’t decided which, but I would love to hear your ideas and suggestions. So, add comments to this entry and I’ll be sure to approve them all as quickly as possible.

Ego interviewing

I received a reply to my Google post today. Since I get a lot of spam, I filter everything on this blog. Usually I just make sure a comment isn’t porn or spam and then approve it. This one was extremely interesting, but not very compelling, mostly because it was posted anonymously. That could mean it’s a Google employee who didn’t want to be found, or just some random person trolling (I do have his IP address however – hehe). Anyways, the comment is a personal attack, so at first I took offense and got defensive. But the more I thought about it, the more I began to think about interviewing again. Here’s the comment:

Wow, this entire article is littered with hubris and cocky remarks, but the summary takes the cake. Why didn’t Google hire you? They were trying to best you in a battle of wits and you were too smart! Of course that must be it since you’re an algorithms God!

Seriously Bro, you need to check yourself. Your ego, not Google’s, probably cost you the position.

First things first… I want to clear the air about my post and this comment. The majority of my post has no cocky remarks or hubris. It’s just what happened, pretty simple. I agree that my post got somewhat more subjective near the end, but I was forced to draw conclusions because of the treatment I received. Just to give you perspective, the Google recruiter left me a voice mail telling me I was not selected because they “thought my coding skills were not very good”. You read that correctly: A VOICE MAIL. That was very unprofessional. My phone and Boulder interviews were great and very productive. However, my treatment in California and afterwards was very poor.

As for being a God of algorithms, I clearly state I’m not. I had trouble with some of their questions and that’s the point. They want to challenge you, which I completely agree with.

As for egos and battles of wits, I disagree with the commenter, and this is what I’ve been thinking about since I read the comment. Ego is an interesting beast. In an interview, there are a few possibilities, two of which are:

1. The interviewee has an ego and thinks they are better than the interviewer or the company. In this case they usually just answer the questions, and if they get stumped, it’s my experience that they start arguing. In most cases the arguments are defensive and without any pragmatic basis.

2. The interviewer has an ego and thinks they are better than the interviewee. In this case the interviewer isn’t out to find good candidates. They are out to find the candidates that will make them feel smart. If you answer their questions well, try to create a good dialog, or introduce any pragmatism, you’ll probably get the toss.

So, what happened with me? Well, I have no doubt that the dictionary question threw me off a bit. However, I was definitely excited about being able to interview at Google. Even though I didn’t know the answer, I was interested in trying to figure it out through a good dialog. My first reaction was that it might be a premature optimization, because the solution could be complex. In truth the optimization isn’t complex, but I didn’t know the answer at first. So, I wanted to begin a dialog about it and try to work through it. This didn’t go over well. In the end the interviewer got annoyed and just gave me the answer after I asked for it (he was about to move on without finishing the question when I stopped him).

Next was the hiring error question. This question had very little information around it, and I was again interested in solving it. I did what I normally do and asked some questions, trying to start a dialog about it. This again didn’t go over well. The interviewer actually tried to solve it himself. I’m not making that up, it actually happened. But he couldn’t. This really made me think that this interviewer had a lot of ego and access to a list of questions to pick from. He picked a few, but hadn’t actually solved them all. After the interview I talked a bit with my local mathematics wiz (two master’s degrees and almost a PhD, in case people think he’s just some quack I work with) just to see if the interviewer and I were missing something. He confirmed that the question didn’t make sense given the information, and that the solution I posed was correct without more information and bounds. You have to reduce the error to fix things.

Lastly, my comments about the tag-along were completely subjective and editorial. I personally thought it was poor form to bring along someone who wasn’t going to participate in the interview. It was uncomfortable and made it difficult to concentrate. Was this ego? Maybe. But probably just plain nerves.

Now, I think the commenter was specifically thrown off by my summary, and once he had finished that, he forgot about the other stuff I wrote. The summary was completely editorial and just a plain old guess as to the end result. The reason I suggested that I was “grilled” because of my resume is that I’ve had a number of friends interview at Google. Many of them never got the questions I did, and my phone interviewer even told me that his questions were the hardest that Google gave. So why, when I did so well in the three interviews prior, would everything fall apart at the end? How was it that when I called my friends at Google, they were astonished that I wasn’t hired?

In all honesty, I don’t know. So I have to hypothesize as to what happened in California. During both interviews I had in California, I had a feeling that I wasn’t really a candidate. I wanted to get a sense of what life was like at the company. I asked the first interviewer if people went out for drinks or if there were company activities, and his answer was, “I’ve got kids and I don’t do that”. Another interviewer took off partway through the interview. So, given my experience, I drew some conclusions. Were they accurate? Who knows, but I did warn readers that I had nothing to back them up. As for my points:

Have I worked on huge systems? You bet.

Are the systems I’ve built larger than those of the majority of other engineers? Yes. 3000-5000 servers, distributed, etc.

Do I think that I write solid code? Absolutely.

Do I think I could work at Google on huge systems? Yes.

For these reasons, I wrote my summary and made a guess as to the result. The conclusion I came to was based on experience. Having been part of the interviewing process at a number of companies, I have seen a few interviewers clobber lesser candidates in the interview but hire them anyway, while passing over good candidates. Therefore, I believe it is fundamentally important to ensure your interviewers are doing a good job and working in the best interest of the company. If they are allowing ego to interfere with making good selections, they shouldn’t be interviewing. Lastly, my summary only applies to my experience in California. My phone interview and Boulder interview were great. Not a single sign of ego during either.

Google interviewing

The phone screen

Question #1: How would you go about making a copy from a node in a directed graph?

This I had done before for Savant’s dependency tree. I had never distributed this type of operation, but since the operation is simple on a single machine, distributing it seems to be just a matter of dividing the work and then sharing information via standard distributed operations like scatter, collect, etc.
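For the curious, here’s a minimal Java sketch of the single-machine version, using a visited map to handle cycles (the Node class is hypothetical, just to show the shape):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Node {
    int value;
    List<Node> neighbors = new ArrayList<Node>();
    Node(int value) { this.value = value; }
}

class GraphCopier {
    // DFS clone. The visited map does double duty: it stops cycles and
    // guarantees each original node maps to exactly one copy.
    static Node copy(Node node, Map<Node, Node> visited) {
        if (node == null) return null;
        Node existing = visited.get(node);
        if (existing != null) return existing;
        Node clone = new Node(node.value);
        visited.put(node, clone);
        for (Node neighbor : node.neighbors) {
            clone.neighbors.add(copy(neighbor, visited));
        }
        return clone;
    }
}

Call it with copy(start, new HashMap<Node, Node>()).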

Question #2: How would you write a cache that uses an LRU?

This I had also worked on. At Orbitz the hotel rates were cached in a very large cache that used an LRU. The LRU implementation was actually from Apache, but the principle was simple. The cache was just a large array that we used double hashing on (for better slot usage). Collisions were immediately replaced, although a bucket system could also be used to prevent throwing away recent cache stores. The LRU itself was just a linked list of pointers to the elements in the hash. When an element was hit, its node in the linked list was removed and appended to the head. This was faster if the hash stored a struct that contained a pointer to the node in the linked list.
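In Java, a minimal sketch of this structure is nearly free, since LinkedHashMap already pairs a hash table with exactly that kind of linked list (this is illustrative, not the Apache code we used):

import java.util.LinkedHashMap;
import java.util.Map;

// With accessOrder = true, every get() moves the entry to the tail
// (most recently used), and removeEldestEntry evicts the head (least
// recently used) when the cache overflows.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // true = access order, not insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}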

Question #3: How would you find the median of an array that can’t fit in memory?

This question I had some problems with. I wasn’t fond of theory and numerical computation at school, so I did what was needed and retained very little. Not to say that I couldn’t learn the material; it just didn’t interest me all that much, and to date I’ve never used any of it. Of course, if I was going to work on the search engine code at Google, I would need to brush up. Anyways, I started thinking about slicing the array into segments and then distributing those. Each agent in the grid could further slice and distribute to build a tree. Then each agent would find its median and push that value to its parent. That is as far as I got, because I didn’t have the math to know if there was an algorithm to use those medians to find the global median. Well, after the call I busted out the CLR book and found that all this stuff falls under “selection” algorithms. There is one algorithm that does exactly as I described, but it then takes the “median-of-medians” and partitions the entire list around that value. This could probably be done on the grid, and there was a PDF paper I stumbled across that talked about doing things this way. I’m not sure that’s the best call though. After thinking about it more, I wondered if the distributed grid could be a b-tree of sorts that uses the data set’s median value (e.g. for real numbers, always a known quantity if the data set is known) to build the tree. Once the tree was built, you’d just recursively ask each node for its count and then ask for the ith element, where i = count / 2. Again, I couldn’t really find anything conclusive online to state this was a possible solution.
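For reference, here’s a minimal single-machine quickselect sketch of the selection idea. The median-of-medians variant only changes how the pivot is picked (to guarantee linear time), and the distributed version partitions the same way, just across machines:

import java.util.Random;

class Selection {
    private static final Random RANDOM = new Random();

    // Returns the ith smallest element (0-based) of a.
    static int select(int[] a, int i) {
        int lo = 0, hi = a.length - 1;
        while (lo < hi) {
            int p = partition(a, lo, hi, lo + RANDOM.nextInt(hi - lo + 1));
            if (i == p) return a[p];
            if (i < p) hi = p - 1; else lo = p + 1;
        }
        return a[lo];
    }

    // Lomuto partition: moves the pivot into its final sorted position
    // and returns that position.
    private static int partition(int[] a, int lo, int hi, int pivotIndex) {
        int pivot = a[pivotIndex];
        swap(a, pivotIndex, hi);
        int store = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) swap(a, j, store++);
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    static int median(int[] a) {
        return select(a, a.length / 2);
    }
}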

Question #4: How would you randomly select N elements from an array of unknown size so that an equal distribution was achieved for infinite selects?

This one I sorta understood and got a very minimal answer to. I started by thinking about selecting N elements from the array. If there were more elements, you continue to select until you have 2N elements. Then you go back and throw out N elements at random. This operation continues until the entire array has been consumed, and you now have a set of N elements which are random. The problem is that the last N elements in the array get favored, because they only had the opportunity to be thrown out once, while the first N had the opportunity to be thrown out L / N times, where L is the length. The interviewer then mentioned something about probability, and I started thinking about the Java Memory Model (these really have little in common, but that’s where my brain went). The JVM promotes objects into tenure after they have been around for a while. In the same sorta system, as you get to the end of the list, the probability of keeping those last N elements should be very low. You apply a probability weight to those elements that determines if they are kept or thrown out. Apparently you can solve this sucker by selecting a single element at a time and applying some type of probability function to decide whether to keep it or not. I again had little luck finding information online about this, and I ended up reading up on Monte Carlo, Markov chains, sigma-algebras and a ton of other stuff, but I have yet to put it all together to make a reasonable guess at the probability function. Since the length of the list is unknown, the function must use the number of elements seen thus far in calculating the probability of keeping an element. And, in order to handle an array of length N, it must always select the first N. Therefore, it must have some mechanism for going back and throwing values out. So, I think I was on the right track, just didn’t get all the way to the end.
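I’ve since learned that the standard technique here appears to be reservoir sampling (Vitter’s Algorithm R), and it matches the shape I was groping for: always keep the first N, then keep the kth element with probability N/k, evicting a random resident. A minimal sketch, assuming the stream arrives as an iterator:

import java.util.Iterator;
import java.util.Random;

class ReservoirSampler {
    // Returns N elements such that every element of the stream ends up
    // in the sample with probability N / length, without knowing the
    // length up front.
    static int[] sample(Iterator<Integer> stream, int n) {
        Random random = new Random();
        int[] reservoir = new int[n];
        int seen = 0;
        while (stream.hasNext()) {
            int value = stream.next();
            seen++;
            if (seen <= n) {
                reservoir[seen - 1] = value;           // always keep the first N
            } else {
                int slot = random.nextInt(seen);       // uniform over [0, seen)
                if (slot < n) reservoir[slot] = value; // keep with probability N / seen
            }
        }
        return reservoir;
    }
}

So the probability function I was hunting for is just N divided by the number of elements seen so far.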

Boulder office

The second round of interviews was at the Boulder office. I interviewed with two guys from there, and again things were very focused on algorithms. The questions I received from these two were:

Question #1: If you have a list of integers and I ask you to find all the pairs in the list whose sum equals X, what is the best way to do this?

This is pretty simple brute force. You can iterate over the list and for each value figure out the value necessary to sum to X. Then just brute force search the list for that value. You can however speed this up with a sorted list by proximity searching. You’ll be able to find the second value faster based on your current position in the list and the values around you. You could also split the list into positive and negative around an index, saving some time. You could hash the values. There are lots of ways to solve this sucker, it just depends on the constraints you have, the size of things and what type of performance you need.
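As one example, here’s a minimal sketch of the hashing variant, which finds all the pairs in a single pass with O(n) time and space:

import java.util.HashSet;
import java.util.Set;

class PairSum {
    // For each value, look for its complement (x - value) among the
    // values already seen, then record the value itself.
    static void printPairs(int[] list, int x) {
        Set<Integer> seen = new HashSet<Integer>();
        for (int value : list) {
            if (seen.contains(x - value)) {
                System.out.println((x - value) + " + " + value + " = " + x);
            }
            seen.add(value);
        }
    }
}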

I had a few other questions in Boulder, but I can’t for the life of me recall what they were. Nothing too difficult. Just all algorithms and time calculations (big-o).

California office

The last interview was in California, and they flew me out for it. I was put up quite a ways from the campus in a slightly seedy motel. Hey, I figured they would Microsoft me (Microsoft put us up at the Willows Lodge for the first MTS, which is a 4/5 star in Washington. Very posh) because they are Google and have a few billion in the bank. I wonder if they did that to throw me off a bit and see how I did under less-than-optimal conditions, but who knows.

Question #1: Find all the anagrams of a given word

This was the start of my demise. I asked the interviewer why we were going to optimize the dictionary when it was only a few hundred thousand words. Well, that was the wrong approach for sure, and we started arguing about the size of the common dictionary. He said it was like 1 million words and I said it was more like 150K. Anyways, it doesn’t matter really, but this interviewer and I had a personality conflict, and his accent was so thick I kept having to ask him to repeat himself. I think this is why they didn’t hire me.

Anyways, this problem’s simple. Just sort each word’s letters and use that as the key into a Map whose value is a list of all the words with those letters. Of course, he and I were arguing so much about things by this point that I didn’t get the answer and he had to tell me, but the answer’s simple.
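A minimal sketch of that answer (the class and method names are just for illustration):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AnagramIndex {
    private final Map<String, List<String>> index = new HashMap<String, List<String>>();

    // Sorting a word's letters gives a canonical key shared by all of
    // its anagrams.
    void add(String word) {
        String key = sortLetters(word);
        List<String> words = index.get(key);
        if (words == null) {
            words = new ArrayList<String>();
            index.put(key, words);
        }
        words.add(word);
    }

    // Finding the anagrams of a word is then one sort plus one lookup.
    List<String> anagramsOf(String word) {
        List<String> words = index.get(sortLetters(word));
        return words != null ? words : new ArrayList<String>();
    }

    private static String sortLetters(String word) {
        char[] letters = word.toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }
}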

Question #2: If you have an algorithm for selecting candidates that has 10% error and you only select 10% from the available list, how many bad candidates do you select given a list of 100? How can you reduce the number of bad candidates if you can’t fix the error?

I was asked this question by the same interviewer from Question #1, and we were still at odds. Well, I started to solve it and he kinda got annoyed. He got up and started trying to solve it for me (strange). Then he sat there for 3-5 minutes trying, and eventually gave up without the solution. The best part was when he said, “just trust me”. I was trying to do the math to solve this thing, and he couldn’t even start the math and finally gave up. This really tipped me off to the fact that Google has a list of questions that interviewers can pick from, and this guy picked one and forgot the solution. He hadn’t ever solved this problem himself, of that I’m sure.

As for my solution, I wanted to see what happened to the results as the set sizes and variables changed. If you have an error of 10%, that means from a pool of 100 you either throw out 10 good candidates or hire 10 bad candidates, or some mixture of those (e.g. 5 good 5 bad, or 3 good 7 bad). Since the error is fixed, the only way to reduce the number of bad candidates hired is to reduce the initial set size. You want to reduce 100 to 10 or 5. That way you minimize your error. The issue then is that, since the error is fixed, over time you still hire the same number of bad candidates. As you repeat the process using a smaller set size, you eventually end up with the same number of bad candidates as you would with the original set size.

So, and this is what I argued with the interviewer, the only real solution with the information provided and without making large assumptions is to reduce the error. You have to somehow fix the problem of 10% error because it doesn’t matter in the long run what the set size is. Of course, he didn’t want to discuss that issue, just wanted me to solve the original problem.
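For what it’s worth, here’s a toy simulation of my reading of the problem (the model is my own assumption, since the question was underspecified): with the error fixed at 10%, the long-run fraction of bad hires is the same whether you select from pools of 100 or pools of 10.

import java.util.Random;

class HiringErrorSim {
    public static void main(String[] args) {
        Random random = new Random();
        double errorRate = 0.10; // fixed: each selection misjudges 10% of the time
        for (int poolSize : new int[] {100, 10}) {
            int hires = 0, badHires = 0;
            while (hires < 100000) {                     // keep hiring in rounds
                int toHire = Math.max(1, poolSize / 10); // select 10% of the pool
                for (int i = 0; i < toHire; i++) {
                    hires++;
                    if (random.nextDouble() < errorRate) badHires++;
                }
            }
            System.out.printf("pool of %3d: %.1f%% bad hires%n",
                    poolSize, 100.0 * badHires / hires);
        }
    }
}

Both pool sizes print roughly 10%, which is the point: shrinking the set only delays the error, it doesn’t remove it.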

Question #3: More discussion of the sum problem

We talked more about the sum problem from the Boulder interview. We discussed reducing the processing time, finding the sums faster, and pretty much all the permutations you can think of. It wasn’t too bad, and the guy interviewing me for this question was okay. One strange thing was that he had a tag-along who was a new hire, and the tag-along kept sorta smiling at me. That was really disconcerting. Again, this seemed like they were trying to throw me off or something. This was very unprofessional in my opinion, but whatever.

Question #4: Two color graph problem

This is the old graph coloring problem that everyone gets in school. He wanted some pseudo code for this sucker, and I busted out a quick and dirty recursion for it. Of course there were issues in my code, because I only had about 2 minutes to do it. We fixed those and then tried to solve it without recursion. This is always pretty simple to do as long as you can turn the recursion into a loop with a worklist. Just use a linked list, and anytime you hit a graph node that has links, add those nodes to the end of the list. Then you can just iterate over the list until there are no more elements. This is how Savant traverses the dependency graph it builds, so I’ve done it before.
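Here’s a minimal sketch of that worklist version (GraphNode is a hypothetical stand-in for whatever node type you have):

import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class TwoColor {
    static class GraphNode {
        List<GraphNode> links = new LinkedList<GraphNode>();
    }

    // Pull a node off the list, color its uncolored neighbors the
    // opposite color, and append them to the end. Returns false if two
    // colors are ever forced onto the same node (not two-colorable).
    static boolean twoColor(GraphNode start) {
        Map<GraphNode, Boolean> color = new HashMap<GraphNode, Boolean>();
        Queue<GraphNode> worklist = new LinkedList<GraphNode>();
        color.put(start, Boolean.TRUE);
        worklist.add(start);
        while (!worklist.isEmpty()) {
            GraphNode node = worklist.remove();
            for (GraphNode neighbor : node.links) {
                Boolean assigned = color.get(neighbor);
                if (assigned == null) {
                    color.put(neighbor, !color.get(node));
                    worklist.add(neighbor);
                } else if (assigned.equals(color.get(node))) {
                    return false;
                }
            }
        }
        return true;
    }
}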

Summary

The interesting point of the entire process of interviewing with Google was that I was never asked a single question that was not algorithmic. I’ve talked to dozens of other folks that have interviewed, and they got questions about design, classes, SQL, modeling and loads of other areas. I think I understand why they were asking me these questions: ego. I’ve worked on large-scale systems, which are probably larger than a lot of the applications that most folks at Google work on. I mean, not everyone there can work on the search engine or GMail. Most of them probably work on smaller apps, and only a few get to touch the big ones. I think, and have no supporting evidence to back it up, that because I worked at Orbitz, the interviewers out in California wanted to show me up. The humorous part is that by doing so they firmly proved their own interview question. By allowing ego and dominance to drive their hiring process, they are increasing their error. I hope Google’s smart enough to put in checks and balances and remove interviewers like this from the system. Because, like I mentioned above, the only way to fix the problem is to reduce the error, and the only way to reduce the error is to ensure that the interviewers are making the best decision possible for the company and not for themselves.

Your framework sucks and I almost lost my job

Just read Bob’s post from a bit back about XML and annotations, and it occurred to me that Bob’s like the rest of us: we LOVE pointing out flaws in things that annoy us! My string of “Why Hibernate Sucks” posts pulled in a ton of comments when they first appeared, just like Bob’s post pulled in more comments than if he had used the P.C. name of “Annotation Myth Busters”. With my posts, as with Bob’s, the Hibernate folks pounced (the Spring guys pounced on Bob’s post) and we had some nice comments going back and forth. I decided to scale back the titles to “Hibernate pitfalls” to be more P.C., but really I think in many ways “Hibernate sucks” is right, and I probably should have stuck to my guns a bit more.

Just to give you some context on why I changed my post names… A while ago I was working for a company, and I made some comments about the programmers at that company being crazy about coding standards. Of course, I phrased it in a much worse manner. Well, if you did a little bit of Google searching, you found my name next to these comments. The truth was the employees at the company were a bit nutso about coding standards. Member variable names must start with underscores, you must not use the this keyword, 4 spaces, no tabs, curlies on the same line, no newlines, no continuation indents, etc. etc. I mean, it was like a 10-page coding standards document. Well, I loathe coding standards. I code the way that makes sense to me, and things like underscores don’t make sense. They are hard to type, slow me down, are ugly to me, and don’t add any value except to VIM users who refuse to install plugins.

(EDIT based on comments below. I don’t hate coding standards in general. I hate being forced to use someone else’s 10 pages of coding standards that don’t feel right to me. Feel free to read my comments below to fully understand my position. Also, my motto is ‘if you can’t work in other people’s code and ignore the style, you should probably find a different type of job’.)

As it turned out, my manager found these comments. He, luckily, was an awesome guy and very savvy about corporate communication. We talked about the comments, and I went back and changed them to make them more P.C. and, of course, removed the company name. So, now I’m a bit more careful about what I write. But the fact remains that sometimes I really WANT to say “Hibernate sucks”. The fact of life is that it’s all a game, and if you piss off the wrong person, they could make your life miserable or fire you on the spot. Plus, as I’ve learned, it’s much more fun to think of intelligent ways to point out things that suck anyway.

My latest Ubuntu setup

I’ll keep editing as I go:

Install LaunchBox

bash$ sudo apt-get install gnome-launch-box
bash$ gconf-editor

Open apps->gnome-launch-box->bindings and set activate to

<Super>l

This allows gnome-launch-box to be activated using the Windows-L shortcut.

Next, set up gnome-launch-box to start automatically. Open System->Preferences->Sessions and create a new Startup Program. I use this command line:

/usr/bin/gnome-launch-box -t -n

Install the latest Compiz/Beryl merge

In progress.

Google is an easy target

I just got done reading the latest Cringely column, which of course revolves around my favorite bunch, Google.

(For those who don’t know me and didn’t hear my quite amusing story: I interviewed at Google, and after the phone screen, two Colorado interviews, and the Cali interview, they told me I couldn’t code and passed on me – hehe)

Furthermore, Philip Ogren and I had lunch yesterday, and he mentioned that he interviewed at Google. He recalled one question that gave him particular trouble, which he believed was the question that cost him any further interviews. It was something like, “from an infinite stream of numbers, select the highest 10%.” I had many similar questions, such as “How would you find the median of an array that can’t fit in memory?” and “How would you randomly select N elements from an array of unknown size so that an equal distribution was achieved for infinite selects?” These are of course statistics-based selection questions, but neither Philip nor I could definitively answer them for the interviewers. However, I moved on and he did not. In fact, he might have answered the questions better, and he definitely has more credentials than I do (he’s working on his PhD and I’m obviously not). So, before I get back to Cringely, what’s up with that?

It’s simple. Error is deterministic, guaranteed and often fixed. His interviewer was probably someone different than mine, probably with a larger ego, and this introduced error. My phone interviewer was a very cool dude who actually had a PhD in particle physics but gave up research, probably for more money. Okay, back to the topic at hand – error. Google is almost unavoidably introducing more error, because smart engineers tend to believe they are better than you, and that tendency holds 100% for the 10-15 folks I’ve met from Google, with the exception of my interviewer and a few friends I have there. This ego-centric view of other developers introduces a high level of error when left unchecked.

Okay, so what does interviewing have to do with Google and Cringely’s belief that Google’s high concentration of smart engineers will be its downfall? Well, it reveals the fact that Google is an enormously easy target. People talk about Google’s hiring mistakes (error), issues with process and policy, and just about anything that isn’t 100% perfect – meaning everything. Microsoft is likewise an easy target. The difference between Google and Microsoft is that Google used to be the good kid, and everyone loved them to death. Now it seems that more and more people are shifting that view, looking at all the flaws and issues Google has, and looking for ways to theorize their eventual decline into just another software factory that produces new versions of the same products with mundane certainty. Google is becoming just another evil empire, and if you look objectively, you can see that they are already heading full steam into the inevitable abyss where all enormous companies eventually end up. Google is just another company that everyone will love to hate but continue using everything they produce for at least another 10 years. Google self-destruction theories will become more and more common, and folks will begin swapping “did you hear Google messed up X” stories at parties and events. This will all happen because, as a society, we tend to be deterministic in our love and eventual hatred of companies on top. This cycle is quite simple to plot, and if we did that exercise, Google would probably be right at the top, on the verge of a lengthy but steady fall.

Vista morning ponderings

I’ve tried to post this message three times this morning, and three times I’ve lost the post. First it was IE7 sucking, then it was my hosting company sucking worse, and finally I’m starting over again. This time I’m keeping it short. Here are my morning thoughts.

  • Vista looks freakin’ amazing!
  • Looks aren’t everything, but help make life nicer
    • I can’t run Beryl or Compiz on my desktop because of ATI support – sucks – so Linux still looks horrible, even with the best theme available and hours of tweaking.
    • Even with Beryl or Compiz, X Windows sucks because of the networking code at the bottom. OpenGL, FB, DRM, DRI, whatever should really be at the bottom of X, and the rest of it can go. It sucks a lot anyways.
  • Vista sucks for developing. No shell with completion, no virtual desktops, you have to install everything by hand, and it sucks trying to find good free software.
  • Linux has so much good support now that, even though it looks like butt, it is still better for developers. Any developer that says they are more productive on Windows has been smokin’ some serious crack.

MSDN is down

I just went over to MSDN to look for the product keys from my subscription, which I received at last year’s Microsoft Technology Summit. However, it looks like someone really messed up a deployment of that application, and I got one of those nasty ASP.Net errors, which anyone who has ever developed in .Net knows very well.

So, I figured I’d call MSDN and let them know. Well, the MSDN office is closed on the weekends, so I tried calling Microsoft support directly to let them know. However, the call center employee I talked to had no clue how to let the folks who manage the website know. When I heard that, I was floored. A company as large as Microsoft should at least have some kind of system that employees can use to raise alarms about website failures so that they get handled. Hell, the Orbitz call centers could always contact Orbitz headquarters to let them know the site was down. But most of the time we already knew, since the website was monitored 24/7. Well, I guess I’ll just have to wait for them to restore the website, but here’s the screenshot I grabbed just for fun.

Enjoy!

Screenshot-Runtime Error - Firefox.png