Tag Archives: twitter

[video] – The Filter Bubble – From Human to Algorithmic Gatekeepers

Web personalization and personalized recommendations have recently been attracting more and more interest. Companies like Amazon, Google, Netflix, The New York Times, Facebook, Twitter and others already personalize their products in different ways. Take Google’s search results as an example: have you ever noticed that a friend of yours gets different search results than you do for the same search query? If you have never noticed, just try it out; it is really worth seeing. Another example is Amazon’s product recommendations, which are based, for example, on your purchases and your product ratings.

In the following TED Talk, Eli Pariser explains how “human information filters” are being replaced by algorithmic ones, in other words how recommendation engines filter information for you. Have a look at the video; it is really worth watching:

Do you know other examples of web personalization or recommendation engines? Please leave me a comment at the end of this post.

[video] – How is Hadoop used at Twitter?

In the following video Dmitriy Ryaboy, a Twitter Analytics Engineer and a former Cloudera Intern, explains how Twitter uses Hadoop and Pig. Enjoy the video and have a good weekend!

[video] – HBase and Pig: The Hadoop Ecosystem at Twitter

I have just found this very interesting video dealing with the implementation of HBase and Pig in combination with Hadoop at Twitter:

Grabeeter – Grab and Search your Tweets offline and online

Grabeeter (@grabeeter) has just been launched and I think it is a very useful tool.

Grabeeter enables you to grab your tweets, which means that you can store them on your local hard drive in a structured format (XML at the time of writing). Using the Grabeeter Client, you can also search your locally stored tweets.

This partly solves the following problem: you may have noticed that once you have written more than 3200 tweets, you can no longer access your earliest tweets due to Twitter’s access restrictions.

If you register on Grabeeter before you have reached 3200 tweets on Twitter, you will be able to access all of your tweets in the future. Grabeeter archives your tweets and enables you to export them in a structured format (XML and JSON at the time of writing). You can also use the Grabeeter Client to directly access your interesting tweets on Twitter again.

Just go to Grabeeter and register with your Twitter username. All the other work is done for you. Afterwards, you can export and search your tweets online, or offline using the Grabeeter Client.
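
To give an idea of what searching an exported archive could look like, here is a minimal Python sketch. Please note that the XML layout (`tweets`, `tweet`, `created`, `text`), the sample data and the function name `search_tweets` are assumptions made purely for illustration; the actual Grabeeter export format may look different.

```python
import xml.etree.ElementTree as ET

# Hypothetical export layout; the real Grabeeter XML schema may differ.
SAMPLE_EXPORT = """<tweets>
  <tweet id="1"><created>2010-05-01</created><text>Trying out Hadoop and Pig</text></tweet>
  <tweet id="2"><created>2010-05-02</created><text>Grabeeter archives my tweets</text></tweet>
  <tweet id="3"><created>2010-05-03</created><text>More on HBase and hadoop</text></tweet>
</tweets>"""


def search_tweets(xml_text, keyword):
    """Return (id, text) pairs of tweets whose text contains keyword (case-insensitive)."""
    root = ET.fromstring(xml_text)
    needle = keyword.lower()
    matches = []
    for tweet in root.iter("tweet"):
        # A missing <text> element is treated as an empty tweet.
        text = tweet.findtext("text", default="")
        if needle in text.lower():
            matches.append((tweet.get("id"), text))
    return matches


if __name__ == "__main__":
    for tweet_id, text in search_tweets(SAMPLE_EXPORT, "hadoop"):
        print(tweet_id, text)
```

The search is a plain substring match, which is enough for a small personal archive; a larger archive would probably benefit from a proper index.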

For developers, we also provide a small Grabeeter API so that you can access your tweets from an application you have developed.

Have fun! We are happy to get feedback from you.

Twitter’s use of Cassandra, Hadoop, Pig and HBase for highly distributed Data Processing and Analysis

Kevin Weil, Analytics Lead at Twitter, recently gave a presentation on Twitter’s use of Cassandra, Pig and HBase. Especially interesting is how Twitter uses Hadoop and Pig in its data analysis process.

(via @kevinweil)

Another great presentation, from Tobias Ivarsson, gives an overview of NoSQL:

(via @thobe)
