Thursday 18 October 2012

Fwd: One petabyte of data loading into HDFS with in 10 min.

Alright, the title is misleading, but it's been amusing following this thread on the Hadoop mailing list.
Surprisingly, a lot of the replies have been very helpful (even though they did ask whether the question was part of a homework assignment, CHUCKLES).

The original question and the follow-up elaboration are below.
Hahah, comments PLEASE.

Hi Users,
 
Please clarify the below questions.
 
1. With in 10 minutes one petabyte of data load into HDFS/HIVE , how many slave (Data Nodes) machines required.
 
2. With in 10 minutes one petabyte of data load into HDFS/HIVE, what is the configuration setup for cloud computing.
 
Please suggest and help me on this.
 
Thanks&Regards,
P
---------- Forwarded message ----------
From: p K <p  .hadoop@gmail.com>
Date: 10 September 2012 15:40
Subject: Re: One petabyte of data loading into HDFS with in 10 min.
To: user hadoop.apache.org


Hi Users,
 
Thanks for the response.
 

We have loaded 100GB data loaded into HDFS, time taken 1hr.with below configuration.

Each Node (1 machine master, 2 machines  are slave)

1.    500 GB hard disk.

2.    4Gb RAM

3.    3 quad code CPUs.

4.    Speed 1333 MHz

 

Now, we are planning to load 1 petabyte of data (single file)  into Hadoop HDFS and Hive table within 10-20 minutes. For this we need a clarification below.

1. what are the system configuration setup required for all the 3 machine's ?.

2. Hard disk size.

3. RAM size.

4. Mother board

5. Network cable

6. How much Gbps  Infiniband required.

 For the same setup we need cloud computing environment too?

Please suggest and help me on this.

 Thanks,

P.
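
For a rough sense of scale, here is a back-of-the-envelope sketch in Python that compares the throughput reported in the thread (100 GB in 1 hour) with the throughput needed for 1 PB in 10 minutes. The decimal units, the 10 Gbit/s per-node network link, and the helper name min_datanodes are my own assumptions for illustration, not anything from the thread. The replication factor of 3 is the HDFS default. Disk speed, the single-file source and Hive overhead are all ignored, so treat the node count as an optimistic lower bound at best.

```python
GB = 10**9   # decimal gigabyte in bytes (assumption: decimal units)
PB = 10**15  # decimal petabyte in bytes

# Throughput reported in the thread: 100 GB loaded in 1 hour.
observed = 100 * GB / 3600          # bytes per second

# Throughput required by the goal: 1 PB loaded in 10 minutes.
target = 1 * PB / (10 * 60)         # bytes per second

speedup = target / observed

def min_datanodes(size_bytes, seconds, per_node_bytes_per_s, replication=3):
    """Hypothetical lower bound on DataNodes, limited only by network ingest.

    Every block is written `replication` times, so the cluster as a whole
    has to absorb size_bytes * replication within the time window.
    """
    total = size_bytes * replication
    capacity_per_node = per_node_bytes_per_s * seconds
    return -(-total // capacity_per_node)  # integer ceiling division

# Assumption: each DataNode can sustain a full 10 Gbit/s link (1.25 GB/s).
per_node = 10 * 10**9 // 8
nodes = min_datanodes(1 * PB, 10 * 60, per_node)

print(f"observed: {observed / 10**6:.1f} MB/s")
print(f"target:   {target / 10**12:.2f} TB/s")
print(f"speedup:  {speedup:,.0f}x")
print(f"nodes:    {nodes}")
```

So the existing three-machine setup manages roughly 28 MB/s, while the goal needs roughly 1.67 TB/s, a factor of about 60,000. Under the assumptions above that means on the order of 4,000 DataNodes, each saturating a 10 Gbit/s link for the whole ten minutes, which is why no configuration of "all the 3 machines" is going to get there.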


