07-01-2014, 10:47 PM
What numbers, enough to make your head spin ~~~
http://wiki.apache.org/hadoop/PoweredBy
eBay
532-node cluster (8 × 532 cores, 5.3 PB).
Heavy usage of Java MapReduce (a minimal sketch follows below), Apache Pig, Apache Hive, and Apache HBase.
Using it for search optimization and research.
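For a flavor of what "Java MapReduce" means in practice: jobs are written directly against Hadoop's Java API as a mapper plus a reducer. Below is a minimal sketch of the canonical word-count job; the HDFS paths are hypothetical and would normally come from the command line.

[code]
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in its input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sums the counts emitted for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // combiner cuts shuffle traffic
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // Hypothetical HDFS paths, for illustration only.
    FileInputFormat.addInputPath(job, new Path("/logs/input"));
    FileOutputFormat.setOutputPath(job, new Path("/logs/wordcount-out"));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
[/code]

Pig and Hive scripts ultimately compile down to chains of jobs like this one, which is why a single cluster can serve all three workloads.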
Facebook
We use Apache Hadoop to store copies of internal log and dimension data sources and use it as a source for reporting/analytics and machine learning.
Currently we have two major clusters:
An 1,100-machine cluster with 8,800 cores and about 12 PB of raw storage.
A 300-machine cluster with 2,400 cores and about 3 PB of raw storage.
Each (commodity) node has 8 cores and 12 TB of storage.
We are heavy users of both the streaming and the Java APIs. On top of these features we have built a higher-level data warehousing framework called Hive (see http://hadoop.apache.org/hive/). We have also developed a FUSE implementation over HDFS.
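Hive exposes that warehouse through an SQL-like query language and compiles each query into MapReduce jobs behind the scenes. As a small illustration from the Java side, here is a sketch using the Hive JDBC driver against a HiveServer2 endpoint; the host, table name, and credentials are hypothetical.

[code]
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQueryExample {
  public static void main(String[] args) throws Exception {
    // HiveServer2 JDBC driver; host, database, and user are made up.
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn = DriverManager.getConnection(
            "jdbc:hive2://warehouse-host:10000/default", "analyst", "");
         Statement stmt = conn.createStatement();
         // Hive compiles this query down to one or more MapReduce jobs.
         ResultSet rs = stmt.executeQuery(
             "SELECT page, COUNT(*) AS hits FROM web_logs GROUP BY page")) {
      while (rs.next()) {
        System.out.println(rs.getString("page") + "\t" + rs.getLong("hits"));
      }
    }
  }
}
[/code]

The point of the design is exactly what the quote describes: analysts write declarative queries while the cluster's MapReduce layer handles the distribution and fault tolerance.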
Yahoo!
More than 100,000 CPUs in over 40,000 computers running Hadoop.
Our biggest cluster: 4,500 nodes (2×4-CPU boxes with 4×1 TB disks and 16 GB of RAM each).
Used to support research for Ad Systems and Web Search.
Also used to run scaling tests that support development of Apache Hadoop on larger clusters.
See our blog to learn more about how we use Apache Hadoop.
More than 60% of Hadoop jobs within Yahoo! are Apache Pig jobs.
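Those Pig jobs are written in Pig Latin, a dataflow language that compiles to MapReduce. Pig can also be driven from Java via PigServer; a minimal sketch follows, with hypothetical paths and schema.

[code]
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class PigJobExample {
  public static void main(String[] args) throws Exception {
    // Run Pig Latin on the cluster's MapReduce engine;
    // the input path and field layout below are made up.
    PigServer pig = new PigServer(ExecType.MAPREDUCE);
    pig.registerQuery(
        "logs = LOAD '/logs/access' USING PigStorage('\\t') "
        + "AS (user:chararray, url:chararray);");
    pig.registerQuery("grouped = GROUP logs BY url;");
    pig.registerQuery(
        "counts = FOREACH grouped GENERATE group AS url, COUNT(logs) AS hits;");
    // Nothing runs until STORE: Pig builds the whole dataflow first,
    // then plans and launches the MapReduce job(s).
    pig.store("counts", "/logs/url-counts");
  }
}
[/code]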