State-of-the-art collaborative filtering and graph mining algorithms.
Your first step into the Giraph world!
Designed to be used in parallel with our tools.
No need to be a Giraph expert!
Adapt the library to your needs and contribute!
Join our community.
In many cases, your graph is dynamic and you need your analytical tools to be able to handle near real-time changes.
However, a tool like Giraph is designed for batch processing on static data, much like Hadoop. Running Giraph jobs continuously on large graphs to keep your analytics up-to-date can become computationally expensive or lead to slow responses.
On the flip side, writing custom algorithms for dynamic graphs is a hard, time-consuming task that can challenge even experienced programmers. RT-Giraph keeps your graph analytics up to date efficiently and with little programming effort. It runs existing Giraph algorithms, and whenever the graph changes it updates the results incrementally, which is faster than re-running the algorithms from scratch.
Learn more about RT-Giraph
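To make the difference between recomputing and updating concrete, here is a minimal sketch in plain Python. It is not RT-Giraph code and does not use the Giraph API; the class `IncrementalComponents` and its methods are hypothetical names chosen for illustration. It keeps connected components current as edges arrive, using a union-find structure, so each new edge costs almost nothing instead of triggering a full pass over the graph.

```python
class IncrementalComponents:
    """Connected components maintained under edge insertions (union-find).

    Illustrative only: a hypothetical helper, not part of RT-Giraph or Giraph.
    """

    def __init__(self):
        self.parent = {}  # vertex -> parent vertex in the union-find forest
        self.size = {}    # root vertex -> number of vertices in its component
        self.count = 0    # current number of components

    def add_vertex(self, v):
        # A new vertex starts as its own single-vertex component.
        if v not in self.parent:
            self.parent[v] = v
            self.size[v] = 1
            self.count += 1

    def find(self, v):
        # Follow parent links to the root, halving the path on the way.
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def add_edge(self, u, v):
        # Apply one graph change without touching unrelated vertices.
        self.add_vertex(u)
        self.add_vertex(v)
        root_u, root_v = self.find(u), self.find(v)
        if root_u == root_v:
            return  # already in the same component, nothing to update
        if self.size[root_u] < self.size[root_v]:
            root_u, root_v = root_v, root_u
        self.parent[root_v] = root_u  # attach the smaller tree to the larger
        self.size[root_u] += self.size[root_v]
        self.count -= 1

    def connected(self, u, v):
        return (u in self.parent and v in self.parent
                and self.find(u) == self.find(v))


cc = IncrementalComponents()
for edge in [(1, 2), (3, 4)]:
    cc.add_edge(*edge)
print(cc.count)   # two separate components so far

cc.add_edge(2, 3)  # a single new edge merges them in place
print(cc.count)
```

This sketch only handles edge insertions. Deletions and algorithms with global state (for example PageRank) need more elaborate bookkeeping, which is the kind of work the text above describes as hard to write by hand.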
You don't have to be a Giraph expert to run your first job. You only need to have a working Hadoop setup. After that, just use our tools!
Test.fm is (yet another) testing framework for Collaborative Filtering models. Use Python to automate your test environment.
Do you want to run it on a YARN cluster? Do you use Cloudera or Hortonworks? We have you covered. We've packed the right jar for you!
# prepare the data
df = pd.read_csv(..., names=['user', 'item', 'rating', 'date', 'title'])
training, testing = testfm.split.holdoutByRandom(df, 0.9)

# tell me what models we want to evaluate
models = [
    RandomModel(),
    Popularity(),
    TensorCoFi(),
]

# evaluate
items = training.item.unique()
for m in models:
    m.fit(training)
    print m.getName().ljust(50),
    print testfm.evaluate_model(m, testing, all_items=items)
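For readers who want to see what the two simplest pieces of that workflow amount to, here is a rough standard-library sketch of a random holdout split and a popularity baseline. It is an assumption-laden illustration, not test.fm's implementation: `holdout_by_random` and `PopularityBaseline` are hypothetical names, and the rows are plain `(user, item, rating)` tuples rather than a pandas DataFrame.

```python
import random
from collections import Counter


def holdout_by_random(rows, train_fraction, seed=None):
    """Shuffle the rows and cut them into (training, testing) lists.

    Hypothetical helper for illustration; not the test.fm function.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


class PopularityBaseline:
    """Scores an item by how often it appears in the training rows."""

    def fit(self, rows):
        self.counts = Counter(item for _user, item, _rating in rows)
        return self

    def score(self, user, item):
        # The user is ignored: every user gets the same popularity ranking.
        return self.counts.get(item, 0)


# Toy data: (user, item, rating)
ratings = [
    ("u1", "a", 5), ("u1", "b", 3), ("u2", "a", 4), ("u2", "c", 2),
    ("u3", "a", 5), ("u3", "b", 4), ("u4", "c", 1), ("u4", "a", 3),
    ("u5", "b", 2), ("u5", "d", 4),
]

training, testing = holdout_by_random(ratings, 0.9, seed=42)
model = PopularityBaseline().fit(training)

# Rank all known items for one user, most popular first.
items = sorted({item for _user, item, _rating in ratings})
ranking = sorted(items, key=lambda i: model.score("u1", i), reverse=True)
print(len(training), len(testing))
print(ranking)
```

A baseline like this is useful mainly as a floor: a collaborative filtering model that cannot beat popularity ranking on held-out data is probably not learning much about individual users.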
Associate Researcher at Telefonica Digital. Works on recommendation systems (contextual if possible) and is fond of seeing projects come alive. In love with mountains.
A distributed systems guy by birth. He contributes to the Okapi library, and builds RT-Giraph. He is an Associate Researcher @Telefonica Digital.
Snowboarder, urban cyclist and passionate cook. In his spare time he builds Machine Learning and Recommender Systems algorithms.
Deep diving into Big Data Systems. Georgos was a Researcher at Telefonica Digital. He is currently a Senior Scientist at Qatar Computing Research Institute.
Maria the geek girl! She implemented the basis of Okapi during her master's thesis @Telefonica Digital.
Ilias is a Research Associate at Telefonica Digital. In the past he was a researcher at the University of Cambridge, and he received his PhD from University College London (UCL).
With a passion for code ninjaing and all things distributed, Alex contributes to Okapi and RT-Giraph in the context of his master's thesis (European Master on Distributed Computing @UPC and KTH).
João mainly contributes to test.fm, implementing the framework structure and connecting it to Okapi.
Claudio is a graph fetishist, member of the LSDS group @ VU University Amsterdam, and Committer and PMC member of Apache Giraph.
Vasia is a PhD student at KTH, Sweden and UCL, Belgium. She is interested in data-intensive frameworks, currently focusing on graph processing. She has an M.Sc. in Distributed Computing from UPC, Barcelona and KTH.
Self-described Web Janitor and PhD candidate at the University of British Columbia, Vancouver, Canada. He is currently building scalable defense systems to fight against the bad guys on the Web.
Research Scientist at Yahoo Labs Barcelona working at the intersection between data mining and distributed systems. Passionate about cooking and open-source, he contributes to Apache Pig, Giraph, S4, and Hadoop.
PhD Candidate at the Australian National University. Loves building scalable machine learning systems for solving practical problems.