Tuesday, December 11, 2007

Video System Upgrades

I recently discovered a defect in our video server code which was causing some servers to use up 100% of CPU cycles. On Friday we released an update to address this issue which dramatically improved our quality of service and the maximum number of viewers who can watch your channel. Here's a little chart of CPU usage over time, which shows the big drop on Friday:


You probably won't notice a big difference (unless you have 1000+ viewers on your channel), but rest assured we'll be ready when you become the lonelygirl15 of Justin.tv!

Saturday, December 1, 2007

Unintended consequences

We recently decided to add events on Justin.tv to search as part of the enhanced site-wide schedule we're working on.

We thought that adding something to search would require just a couple of pieces of work: we'd need a new template for the new type of search result, and we'd need to add events to our search index. It shouldn't be more than a 30-minute project, all told.

Of course, as I begin implementing it, I immediately notice a complication: every other kind of searchable item on Justin.tv can be sorted by page views, in addition to newness and best text match. So now we have to decide whether to either:
  • Special case the sort code and remove the page view sort for events
  • Add a page views counter for events, requiring a small change to the database schema and a small amount of code in several places. [1]
Neither of these options is particularly difficult, but only because we got lucky and the original change was small as well. We decided to add the counter, if only to satisfy our own OCD. The "just a template" change winds up touching 6 or 7 different parts of the site. This is not a fluke; it's typical. New features nearly always require changes you would never have dreamed of prior to implementation.

This is why second system syndrome is so hard to avoid: It's like invading Vietnam or Iraq. At first everything seems perfectly fine, but the deeper in you get, the more unforeseen complications emerge. Eventually you find yourself under fire from all directions, and you just want to get it over with before you bleed out any worse than you already are.

[1] As a side note, this is why standard databases suck: part of the reason I want to avoid adding a page view counter to events is that it's going to require making a schema change. Of course, I could create a page_views table that solves the problem generically, but that would have been more work to set up in the beginning, when I had no idea I'd have this problem. And because schemas are costly to change in a running system, changing over to that solution would take yet more work now.
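To make the "generic" idea in that footnote a bit more concrete, here is a minimal Ruby sketch of a counter keyed by item type and id, so that a new kind of item (like events) would not need its own column. The PageViews class and its method names are hypothetical, and an in-memory Hash stands in for what would really be a database table; this is not how the site actually stores page views.

```ruby
# Hypothetical sketch: one generic store of page views keyed by
# [item_type, item_id], instead of a page_views column per table.
# An in-memory Hash stands in for the database table here.
class PageViews
  def initialize
    @counts = Hash.new(0) # missing keys read as zero
  end

  # Record one view of the given item.
  def record(item_type, item_id)
    @counts[[item_type, item_id]] += 1
  end

  # Number of views recorded so far for the given item.
  def count(item_type, item_id)
    @counts[[item_type, item_id]]
  end

  # Ids of one item type, most viewed first.
  def most_viewed(item_type)
    keys = @counts.keys.select { |type, _id| type == item_type }
    keys.sort_by { |key| -@counts[key] }.map { |_type, id| id }
  end
end

views = PageViews.new
views.record(:event, 7)
views.record(:event, 7)
views.record(:event, 3)
views.record(:video_archive, 7)
p views.most_viewed(:event) # => [7, 3]
```

Because the item type is part of the key, sorting events by page views needs no change to the events themselves, which is the trade-off the footnote describes: more setup up front, fewer schema changes later.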

Wednesday, November 28, 2007

Reminder: Leah Culver talking at JTV

Leah Culver will be talking tomorrow, 12pm PST at the justin.tv office. A live broadcast of her talk will be available on http://www.justin.tv/hackertv

More details: http://www.facebook.com/event.php?eid=5907866589

Sunday, November 25, 2007

Ruby Shorthand

Justin.tv's web code is written in Ruby. I've found myself using the same collection idioms over and over, so I've abstracted several of them into a file called shorthand.rb.



Most of our web code consists of manipulating collections in one form or another, which is probably why all the shorthand methods are for that. I'm particularly proud of my % operator, which I believe is a specialized mapcar in spirit. Without further ado, code!




module Enumerable
  # Collect one attribute from every element, e.g. users % :login
  def %(field)
    map { |o| o.send(field) }
  end
end

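require 'cgi' # for CGI::escape, used by Hash#to_params

class Hash
  # Keys ordered by their values, or by the block's result if a block is given.
  # Options: :reverse, :offset, :limit.
  def keys_sorted_by_value(options = Hash.new, &block)
    limit = options[:limit] || size
    offset = options[:offset] || 0

    sorted = sort_by do |key, value|
      block ? block.call(key, value) : value
    end

    sorted.reverse! if options[:reverse]

    sorted.map { |key, value| key }.slice(offset, limit) || Array.new
  end

  # Copy of the hash with every value replaced by the block's result.
  def hmap
    h = dup
    keys.each do |k|
      h[k] = yield h[k]
    end
    h
  end

  # Encode the hash as a URL query string (values are escaped, keys are not).
  def to_params
    map { |k, v| "#{k}=#{CGI::escape v.to_s}" }.join("&")
  end
end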


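class Array
  # Index the elements by one of their attributes.
  def hash_by(field)
    h = Hash.new
    each { |o| h[o.send(field)] = o }
    h
  end

  # Put the first element in the middle, with later elements
  # alternating to its left and right.
  def center_first
    centered = Array.new
    last_pushed = false
    each do |x|
      centered.push(x) unless last_pushed
      centered.unshift(x) if last_pushed
      last_pushed = !last_pushed
    end
    centered
  end

  # n elements chosen at random (all of them, shuffled, by default).
  def random_subset(n = size)
    shuffle.slice(0, n)
  end

  # Ruby 1.8.7 and later ship Array#shuffle; this definition is only
  # needed on older versions.
  def shuffle
    sort_by { rand }
  end

  # Split into [elements at even indexes, elements at odd indexes].
  def divide
    evens = Array.new
    odds = Array.new
    each_with_index do |x, i|
      if i % 2 == 0
        evens << x
      else
        odds << x
      end
    end
    [evens, odds]
  end

  # Remove every occurrence of item in place.
  def remove!(item)
    reject! { |x| x == item }
  end

  # Copy of the array without any occurrence of item.
  def remove(item)
    reject { |x| x == item }
  end
end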
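
For a feel of how these read at the call site, here is a small usage sketch. It repeats compact copies of three of the methods above so that it runs on its own; the Channel struct and the viewer counts are made up for illustration.

```ruby
# Compact copies of three shorthand methods so this example is self-contained.
module Enumerable
  def %(field)
    map { |o| o.send(field) }
  end
end

class Array
  def hash_by(field)
    h = Hash.new
    each { |o| h[o.send(field)] = o }
    h
  end

  def center_first
    centered = Array.new
    last_pushed = false
    each do |x|
      centered.push(x) unless last_pushed
      centered.unshift(x) if last_pushed
      last_pushed = !last_pushed
    end
    centered
  end
end

# Hypothetical record type and sample data, for illustration only.
Channel = Struct.new(:login, :viewers)
CHANNELS = [
  Channel.new("hackertv", 1200),
  Channel.new("bobtv", 300),
  Channel.new("audratv", 950)
]

p CHANNELS % :login                          # => ["hackertv", "bobtv", "audratv"]
p CHANNELS.hash_by(:login)["bobtv"].viewers  # => 300
p [1, 2, 3, 4, 5].center_first               # => [4, 2, 1, 3, 5]
```

The % call is the mapcar-like one: it reads as "channels, projected onto login".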


Other coders: If you have your own idioms like these, I'd love to see them! Post them!

Tuesday, November 20, 2007

Justin.tv a Finalist in the Amazon Startup Challenge

I'm pleased to announce that Justin.tv was selected as 1 of 7 finalists in the Amazon Startup Challenge. From the Amazon press release:

"Justin.tv operates a massively scalable live video platform serving about 500,000 video streams per day. Their custom server software, Python Media Server, has been written from scratch to perform optimally on lightweight Amazon EC2 instances. The end result is that live video publishers have a free, easy to use, and completely scalable platform for hosting any type of live video broadcast."

We will pitch our business head to head with the other finalists for a shot at $100K in prizes and an investment offer from Amazon. The event will be held at Amazon headquarters in Seattle. Wish us luck!


Update: Amazon has posted more details about the finalists to their blog.

Monday, November 19, 2007

Second live tech talk - Peter Seibel

We've lined up our second live tech talk. After Leah talks about OAuth on November 29th, Peter Seibel will be talking about Why Syntax [Does|Doesn't] Matter, on December 13th. Peter is the author of the best Common Lisp tutorial going, and is now working on his second book, Coders At Work.

If you're close to the justin.tv office, drop by for the talk. If not, you can catch it live (and participate in the q&a) at the hackerTV channel. If you don't catch it live, the archive will soon show up on that page.

Saturday, November 17, 2007

Live tech talks

I'm very excited about the latest project we've been brewing at justin.tv: We're going to host live tech talks.

These will be taking place every other Thursday afternoon. If you're in the area, feel free to drop by our office. If not, you can catch the event live online (and if you miss it, there's always the archives!).

Our first speaker will be Leah Culver, of Pownce, who will be talking about OAuth on November 29th. There's a Facebook page with more details.

Between talks, I'll be broadcasting a bunch of videos that will hopefully be of interest to hackers, starting with the awesome Abelson and Sussman SICP lectures. Come watch them, and chat with other hackers on the hackerTV channel page.

If you have any suggestions for speakers, or for videos to show between talks, please email me: bill@justin.tv

Wednesday, November 7, 2007

JTV Search API

We launched the new justin.tv search engine a few days ago. Now I'm excited to announce that it has an API you can use in your own programs.



We've based the API on some very familiar open standards (basically HTTP and JSON), so using it should be a piece of cake from just about any language. To see an example in Common Lisp, scroll to the end of this post.



To use our search API, you just need to send an HTTP request to http://search.justin.tv:6979/ with a bunch of parameters. Here's a complete list of the parameters that are available right now:




  • q: The search query (required). Keywords are separated by url-encoded spaces ('+' or '%20'). Anything non-alphanumeric is ignored.
  • sort-by: One of 'bestmatch' (default), 'newest', 'oldest', 'mostviews'.
  • page: Page number in results set, default 1.
  • results-per-page: Maximum 100, minimum 10, default 10.
  • show-archives: Return video archives. 'yes', 'true', 'on' all do the same thing.
  • show-live-broadcasters: Return users who are currently broadcasting. 'yes', 'true', 'on' all do the same thing.
  • show-offline-broadcasters: Return users who are not currently broadcasting. 'yes', 'true', 'on' all do the same thing.
    If none of the above three are specified, they are all assumed to be 'true'.
  • broadcasters: Only return results from a list of named broadcasters, e.g. 'b1,b2,b3'.
  • encode-as: One of 'html-fragment' (default), 'html-page', 'json'.
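
If you'd rather not assemble query strings by hand, a small helper can do it. The following is only a sketch: the base URL and parameter names come from the list above, but the search_url helper and its validation are my own assumptions, not part of the API.

```ruby
require 'uri'

# Base URL as given in the post.
SEARCH_BASE = "http://search.justin.tv:6979/"

# Hypothetical helper: build a search URL from a query and a hash of
# optional parameters (keys are the parameter names listed in the post).
def search_url(query, params = {})
  raise ArgumentError, "q is required" if query.to_s.strip.empty?

  per_page = params["results-per-page"]
  if per_page && !(10..100).include?(per_page.to_i)
    raise ArgumentError, "results-per-page must be between 10 and 100"
  end

  # q first, then the remaining parameters in alphabetical order.
  pairs = [["q", query]] + params.sort_by { |name, _value| name.to_s }
  SEARCH_BASE + "?" + URI.encode_www_form(pairs)
end

puts search_url("cat fight", "sort-by" => "newest", "encode-as" => "json")
# prints http://search.justin.tv:6979/?q=cat+fight&encode-as=json&sort-by=newest
```

URI.encode_www_form turns the space between keywords into '+', which is one of the two separators the q parameter accepts.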


So let's say we want to build a small application, using the search api, that alerts us whenever a new video clip is available that has something to do with cats. Here's a query that would be a good starting point for an application like that:



?q=cat&sort-by=newest&show-archives=true&encode-as=json



Let's pull that apart and look at what each piece does.








  • q=cat: We're interested in search results whose metadata contain the word "cat".
  • sort-by=newest: We want the most recently produced results.
  • show-archives=true: We do want video archives in our results. Note that show-live-broadcasters and show-offline-broadcasters will both default to 'false' because we've set show-archives to 'true'.
  • encode-as=json: The results set should be encoded using json.
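
The defaulting rule for the three show-* parameters is easy to get backwards, so here it is as a short Ruby sketch. This models the behavior described in this post rather than the server's actual code; the function name is hypothetical, and treating any value other than 'yes', 'true' or 'on' as off is my assumption.

```ruby
SHOW_FLAGS = ["show-archives", "show-live-broadcasters", "show-offline-broadcasters"]
TRUTHY = ["yes", "true", "on"]

# Resolve which result types a request asks for:
# - if none of the show-* flags is given, all three are on;
# - otherwise a flag is on only if it was given a truthy value.
def effective_show_flags(params)
  none_given = SHOW_FLAGS.none? { |flag| params.key?(flag) }
  SHOW_FLAGS.each_with_object({}) do |flag, result|
    result[flag] = none_given || TRUTHY.include?(params[flag].to_s)
  end
end

p effective_show_flags("q" => "cat")
p effective_show_flags("q" => "cat", "show-archives" => "true")
```

The first call turns all three flags on; the second leaves only show-archives on, which is why the cat query returns nothing but video archives.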


Let's send that query to search.justin.tv and see what we get back:



http://search.justin.tv:6979/?q=cat&sort-by=newest&show-archives=true&encode-as=json



returns:



[{"type": "video_archive", "broadcaster": "ggjeffy", "id": 28144, "title": "Cat doing cat stuffs", "start_time": 1186542932, "duration": 180}, {"type": "video_archive", "broadcaster": "nekomimi_lisa", "id": 39175, "title": "Nekomimi Cat Doing The Cat Dance", "start_time": 1191805139, "duration": 108}, {"type": "video_archive", "broadcaster": "nekomimi_lisa2", "id": 9832, "title": "CAT FIGHT!!", "start_time": 1185954224, "duration": 180}, {"type": "video_archive", "broadcaster": "nekomimi_lisa2", "id": 9818, "title": "cat trying to hump blanket", "start_time": 1185939617, "duration": 180}, {"type": "video_archive", "broadcaster": "ashleymarie", "id": 36553, "title": "Sister Cat Fight Part 1", "start_time": 1190060150, "duration": 80}, {"type": "video_archive", "broadcaster": "ibrbigottopee", "id": 34698, "title": "I thought i saw a putty cat", "start_time": 1189149636, "duration": 66}, {"type": "video_archive", "broadcaster": "ashleymarie", "id": 36555, "title": "Sister Cat Fight Part 2", "start_time": 1190060349, "duration": 66}, {"type": "video_archive", "broadcaster": "audratv", "id": 39679, "title": "Attack of the Fuzzy Cat Part 2", "start_time": 1191985459, "duration": 180}, {"type": "video_archive", "broadcaster": "audratv", "id": 39656, "title": "Cat Attack", "start_time": 1191980666, "duration": 180}, {"type": "video_archive", "broadcaster": "xk3ll3yx", "id": 40879, "title": "Cat on Head!", "start_time": 1192431728, "duration": 71}]



Now we need to decode that blob of json. Fortunately there's a library to do that for every language under the sun (look on the json page for yours).
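
As a quick illustration of the decoding step, here is a Ruby sketch using the standard json library on two entries copied from the response above. The summarize helper is hypothetical, the URL pattern is the one the Lisp example below prints, and reading start_time as a Unix timestamp in seconds is an assumption based on the values.

```ruby
require 'json'

# Two entries copied from the sample response in the post.
SAMPLE_RESPONSE = <<~TEXT
  [{"type": "video_archive", "broadcaster": "ggjeffy", "id": 28144, "title": "Cat doing cat stuffs", "start_time": 1186542932, "duration": 180},
   {"type": "video_archive", "broadcaster": "xk3ll3yx", "id": 40879, "title": "Cat on Head!", "start_time": 1192431728, "duration": 71}]
TEXT

# Hypothetical helper: decode the JSON text and keep a few fields per result.
def summarize(json_text)
  JSON.parse(json_text).map do |result|
    {
      :title   => result["title"],
      :url     => "http://www.justin.tv/#{result["broadcaster"]}/#{result["id"]}",
      # Assumption: start_time is seconds since the Unix epoch.
      :started => Time.at(result["start_time"]).utc
    }
  end
end

summarize(SAMPLE_RESPONSE).each do |video|
  puts "#{video[:title]} (#{video[:url]}), started #{video[:started]}"
end
```

JSON.parse gives back plain arrays and hashes with string keys, so each result is just a hash you can index by field name.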



So let's see how we would start writing that feline-video-feed in my favorite language, Common Lisp. First we need libraries for doing the http request and the json decoding. trivial-http and cl-json are more than good enough:

(require :trivial-http)
(require :json)

Let's write a "find-cats" function, which will send the http request and return the results as a string:

(defparameter *url*
  "http://search.justin.tv:6979/?q=cat&sort-by=newest&show-archives=true&encode-as=json")

(defun find-cats ()
  (let ((stream (first (last (trivial-http:http-get *url*))))
        (json nil)
        (line nil))
    (loop while (setf line (read-line stream nil nil)) do
      (push line json))
    (apply #'concatenate 'string (nreverse json))))

Now we just need to call that function periodically, parse the json, and print any new cat videos:

(defun cats-alert ()
  (let ((known-cat-videos (make-hash-table :test #'equalp)))
    (loop
      (let ((cats (json:decode-json-from-string (find-cats))))
        (dolist (cat cats)
          (unless (gethash cat known-cat-videos)
            (setf (gethash cat known-cat-videos) t)
            (format t "New cat video!~%~A~%http://www.justin.tv/~A/~A~%~%"
                    (rest (assoc 'title cat))
                    (rest (assoc 'broadcaster cat))
                    (rest (assoc 'id cat)))
            (force-output))))
      (sleep 600))))
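
For readers who don't speak Lisp, the "only print what's new" part of that loop might look like this in Ruby. It is a sketch, not a port: the class and method names are hypothetical, it identifies a video by broadcaster and id rather than by the whole record, and it leaves out the HTTP request and the ten-minute sleep so it can run offline. The sample records reuse titles from the output shown below.

```ruby
require 'set'

# Remembers which results have been seen and reports only new ones.
class NewVideoTracker
  def initialize
    @seen = Set.new
  end

  # Return the results not seen on any earlier call, and remember them.
  # Set#add? returns nil when the key is already present.
  def fresh(results)
    results.select { |r| @seen.add?([r["broadcaster"], r["id"]]) }
  end
end

# Format one result the way the Lisp program prints it.
def alert_text(result)
  "New cat video!\n#{result["title"]}\n" \
    "http://www.justin.tv/#{result["broadcaster"]}/#{result["id"]}\n\n"
end

tracker = NewVideoTracker.new
first_poll = [{ "broadcaster" => "bobtv", "id" => 43959, "title" => "scared cat!" }]
second_poll = first_poll +
              [{ "broadcaster" => "nerdette", "id" => 42933, "title" => "Cat fight" }]

tracker.fresh(first_poll).each { |r| print alert_text(r) }
tracker.fresh(second_poll).each { |r| print alert_text(r) } # prints only "Cat fight"
```

In a real program you would fetch and decode the search results inside a loop, pass them to fresh, and sleep between polls.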

That's it, we're done! Here's an example of the program's output:

CL-USER> (cats-alert)
New cat video!
Get the cat butt out of the way!
http://www.justin.tv/fistonet/44923

New cat video!
cat vs dog
http://www.justin.tv/midolgirl/44656

New cat video!
scared cat!
http://www.justin.tv/bobtv/43959

New cat video!
Bob scares the cat! lol
http://www.justin.tv/bobtv/43970

New cat video!
Cat fight. Or something.
http://www.justin.tv/ashleyisawesome/43814

New cat video!
Cat on Head - Part II (With New Web Cam)
http://www.justin.tv/shamrox/43719

New cat video!
Cat fight
http://www.justin.tv/nerdette/42933

New cat video!
Dear cat in the middle of the road, i hate you.
http://www.justin.tv/icantstopiwontstop/42878

New cat video!
Jasmine Playing... (& the amazing flying cat-jumps) TOO CUTE!
http://www.justin.tv/lizzymayhem/42798

New cat video!
KT, the cat, running around acting strange, then craps on the floor
http://www.justin.tv/nekomimi_lisa/43225


We can't wait to see what you do with the APIs we're developing. Email me (bill at justin dot tv) if you have something cool to show off, or if you have any questions or comments.

Tuesday, November 6, 2007

Welcome to the Justin.tv Tech Blog

Here we will showcase our awesome tech.