bashreduce: A Bare-Bones MapReduce | Linux Magazine
http://www.linux-mag.com/cache/7407/1.html
heh. maybe useful for learning the mapreduce paradigm?
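For anyone using this to learn the paradigm, here is a minimal single-process sketch of the three MapReduce stages (map, shuffle, reduce) applied to word counting. It is not taken from bashreduce; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are made up for illustration, and a real framework would spread the map and reduce calls across machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["the quick brown fox", "the lazy dog", "The end"]))
```

Because each `map_phase` call sees only one line and each `reduce_phase` call sees only one key, both stages can in principle run in parallel, which is the point of the model.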
Facebook, Hadoop, and Hive | DBMS2 -- DataBase Management System Services
Just wanted to add that even though there is a single point of failure the reliability due to software bugs has not been an issue and the dfs Namenode has been very stable. The Jobtracker crashes that we have seen are due to errant jobs - job isolation is not yet that great in hadoop and a bad query from a user can bring down the tracker (though the recovery time for the tracker is literally a few minutes). There is some good work happening in the community though to address those issues.
A few weeks ago, I posted about a conversation I had with Jeff Hammerbacher of Cloudera, in which he discussed a Hadoop-based effort at Facebook he previously directed. Subsequently, Ashish Thusoo and Joydeep Sarma of Facebook contacted me to expand upon and in a couple of instances correct what Jeff had said. They also filled me in on Hive, a data-manipulation add-on to Hadoop that they developed and subsequently open-sourced.
Gearman
# Reverse Worker Code
$worker = new GearmanWorker();
$worker->addServer();
$worker->addFunction("reverse", "my_reverse_function");
while ($worker->work());

function my_reverse_function($job)
{
    return strrev($job->workload());
}
Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages. It can be used in a variety of applications, from high-availability web sites to the transport of database replication events. In other words, it is the nervous system for how distributed processing communicates.
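The worker/client split described above can be sketched in a single Python process. This is only a model of the pattern, not the Gearman protocol or any Gearman client library: `MiniJobServer`, `add_function`, `submit`, `work_forever`, and `run_reverse` are hypothetical names, and a thread stands in for a separate worker machine. It mirrors the "reverse" function from the PHP worker snippet.

```python
import queue
import threading


class MiniJobServer:
    """Hypothetical in-process stand-in for a job server; not the Gearman API."""

    def __init__(self):
        self._jobs = queue.Queue()
        self._functions = {}

    def add_function(self, name, func):
        # A worker advertises, by name, a function it is able to run.
        self._functions[name] = func

    def submit(self, name, workload):
        # A client queues a job and receives a one-slot queue to wait on.
        result = queue.Queue(maxsize=1)
        self._jobs.put((name, workload, result))
        return result

    def work_forever(self):
        # Worker loop: take the next job, run the named function, reply.
        while True:
            name, workload, result = self._jobs.get()
            if name is None:  # shutdown sentinel
                break
            result.put(self._functions[name](workload))

    def shutdown(self):
        self._jobs.put((None, None, None))


def run_reverse(text):
    """Register a 'reverse' function, start one worker, and submit one job."""
    server = MiniJobServer()
    server.add_function("reverse", lambda s: s[::-1])
    worker = threading.Thread(target=server.work_forever)
    worker.start()
    try:
        return server.submit("reverse", text).get(timeout=5)
    finally:
        server.shutdown()
        worker.join()


if __name__ == "__main__":
    print(run_reverse("Hello, Gearman"))
```

The client only knows the function name and the workload, so the worker behind that name could be written in any language, which is the property the description highlights.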
language independent worker framework
XtreemFS - file systems for the masses - a replicated and distributed file system for the internet and cloud storage
Interesting to see the problems that are present at Google regarding staying in sync with the latest kernel code.
Remus
Remus provides transparent, comprehensive high availability to ordinary virtual machines running on the Xen virtual machine monitor. It does this by maintaining a completely up-to-date copy of a running VM on a backup server, which automatically activates if the primary server fails.
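The idea of keeping a backup copy that can take over on failure can be illustrated with a toy model. This is not Remus code and has nothing to do with Xen internals: `ReplicatedMachine` and its methods are hypothetical, a dictionary stands in for VM state, and the output buffering is an assumption of this sketch, included so that nothing the outside world has seen is ever ahead of the backup.

```python
import copy


class ReplicatedMachine:
    """Toy model: a primary whose state is copied to a backup at checkpoints."""

    def __init__(self):
        self.primary_state = {"counter": 0}
        self.backup_state = copy.deepcopy(self.primary_state)
        self.pending_output = []   # produced since the last checkpoint
        self.released_output = []  # what the outside world has seen

    def step(self):
        # The primary does some work and produces output, held back for now.
        self.primary_state["counter"] += 1
        self.pending_output.append(f"count={self.primary_state['counter']}")

    def checkpoint(self):
        # Copy state to the backup first, then release the buffered output.
        self.backup_state = copy.deepcopy(self.primary_state)
        self.released_output.extend(self.pending_output)
        self.pending_output.clear()

    def fail_over(self):
        # The primary is lost: resume from the backup's last checkpoint.
        self.primary_state = copy.deepcopy(self.backup_state)
        self.pending_output.clear()
        return self.primary_state


if __name__ == "__main__":
    vm = ReplicatedMachine()
    vm.step()
    vm.checkpoint()
    vm.step()  # this work is lost on failure, but its output was never released
    print(vm.fail_over(), vm.released_output)
```

In this model the work done after the last checkpoint is repeated after failover, but since its output was never released, an outside observer cannot tell that a failure happened.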
HA solution for Xen
A true account of nearly non-stop MySQL failover (video included) - (ひ)メモ
Multi-master failover with keepalived --vrrp
Hadoop - YDN
"Apache Hadoop* is an open source Java software framework for running data-intensive applications on large clusters of commodity hardware."
Hadoop and Distributed Computing at Yahoo!
hazelcast - Project Hosting on Google Code
Hazelcast is a clustering and highly scalable data distribution platform for Java. Features:
* Distributed implementations of java.util.{Queue, Set, List, Map}
* Distributed implementation of java.util.concurrent.locks.Lock
* Distributed implementation of java.util.concurrent.ExecutorService
* Distributed MultiMap for one-to-many relationships
* Distributed Topic for publish/subscribe messaging
* Transaction support and J2EE container integration via JCA
* Socket level encryption support for secure clusters
* Synchronous (write-through) and asynchronous (write-behind) persistence
* Second level cache provider for Hibernate
* Monitoring and management of the cluster via JMX
* Dynamic HTTP session clustering
* Support for cluster info and membership events
* Dynamic discovery
* Dynamic scaling
* Dynamic partitioning with backups
* Dynamic fail-over
Hazelcast is for you if you want to * share data/state among many s
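As a rough illustration of what "partitioning with backups" and "fail-over" can mean for a distributed map, here is a toy Python sketch. It is not Hazelcast's algorithm or API: `PARTITION_COUNT`, `partition_for`, `owners`, and `ToyCluster` are hypothetical, membership is fixed rather than dynamic, and each key is simply stored on two different members so that one copy survives a single failure.

```python
import zlib

PARTITION_COUNT = 8  # hypothetical value chosen for the sketch


def partition_for(key):
    """Map a string key to a partition with a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % PARTITION_COUNT


def owners(partition, members):
    """Return (primary, backup) for a partition, always on different members."""
    primary = members[partition % len(members)]
    backup = members[(partition + 1) % len(members)]
    return primary, backup


class ToyCluster:
    """Fixed-membership model of a map partitioned across members with one backup."""

    def __init__(self, members):
        if len(members) < 2:
            raise ValueError("need at least two members to hold a backup")
        self.members = list(members)
        self.alive = set(self.members)
        self.stores = {member: {} for member in self.members}

    def put(self, key, value):
        # Write to the primary and the backup, skipping members that are down.
        for member in owners(partition_for(key), self.members):
            if member in self.alive:
                self.stores[member][key] = value

    def get(self, key):
        # Read from the primary, falling back to the backup if it is down.
        for member in owners(partition_for(key), self.members):
            if member in self.alive:
                return self.stores[member].get(key)
        raise RuntimeError("both copies of the partition are unavailable")

    def fail(self, member):
        # Simulate a crash: the member disappears along with its local data.
        self.alive.discard(member)
        self.stores[member] = {}


if __name__ == "__main__":
    cluster = ToyCluster(["node-a", "node-b", "node-c"])
    cluster.put("user:1", "alice")
    primary, _ = owners(partition_for("user:1"), cluster.members)
    cluster.fail(primary)
    print(cluster.get("user:1"))
```

A real system would also re-create the lost copy on another member after a failure; this sketch leaves that out to stay short.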
data distribution platform
Hadoop Live CD at OpenSolaris.org
OpenSolaris Project: Hadoop Live CD
Kazuho@Cybozu Labs: I have started building a distributed storage system named Pacific
Coming out of stealth, SeaMicro is dispelling the Silicon Valley myth that you can’t innovate in hardware anymore. The startup is announcing today it has created a server with 512 Intel Atom chips that gets supercomputer performance but uses 75 percent less power and space than current servers.
New 512-core servers http://venturebeat.com/2010/06/13/seamicro-drops-an-atom-bomb-on-the-server-industry
(private) cloud in a box? http://bit.ly/dBXVTr
@LarsBoeNielsen speaking of going crazy, check out this http://bit.ly/9uxBNh Server V.Next? and yes, i want the new XBox!