Would you rather do all the previous steps in one fell swoop? You can, though a number of preparatory steps are required first. The Amazon Web Services Elastic MapReduce Command Line Interface (AWS EMR CLI) makes all the previous interactive selections completely scriptable. Amazon provides complete instructions for downloading the CLI and completing the prerequisite steps, including creating an AWS account, configuring credentials and setting up a Simple Storage Service (S3) "bucket" for your log files.
If you're running on Windows, download and install Ruby 1.8.7 (which the EMR CLI relies upon), then download and install the EMR CLI itself. From a Command window (a.k.a. DOS prompt), you'll be able to navigate to the EMR CLI's installation folder and enter a command like the one shown here, which creates an EMR job flow with Hive, Pig and HBase, based on an m1.large EC2 instance.
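The exact flags depend on the version of the EMR CLI you download, but a job-flow creation command along these lines should do the trick (the job flow name, instance count and S3 bucket here are placeholders; substitute your own):

    REM Illustrative only -- adjust names, counts and bucket for your setup
    ruby elastic-mapreduce --create --alive ^
      --name "Hive-Pig-HBase cluster" ^
      --hive-interactive --pig-interactive --hbase ^
      --instance-type m1.large --num-instances 3 ^
      --log-uri s3://your-bucket/logs

The --alive flag keeps the cluster running after startup so you can work with it interactively, rather than having it terminate when its steps complete.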
If you're clever, you can embed all of this in a Windows batch (.BAT) file and create a shortcut to it on your desktop. From there, your Hadoop cluster is only a double-click away.
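As a rough sketch (the folder path, job flow name and bucket are placeholders, and flags may vary by CLI version), such a batch file might look like this:

    @echo off
    REM create-emr-cluster.bat -- launches an EMR job flow with Hive, Pig and HBase
    cd /d C:\elastic-mapreduce-cli
    ruby elastic-mapreduce --create --alive ^
      --name "Hive-Pig-HBase cluster" ^
      --hive-interactive --pig-interactive --hbase ^
      --instance-type m1.large --num-instances 3 ^
      --log-uri s3://your-bucket/logs
    pause

The trailing pause keeps the Command window open so you can read the job flow ID the CLI prints before the window closes.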
Once the job flow is created, proceed to the EC2 Instances screen just as you would if the job flow had been created interactively...