Post on 23-Dec-2015


Tutorial for Web Mining Project

-cloud computing platform

Introduction

In the MIS 510 project, your team is required to create a web business, with a complete web site and business functionalities for specific customers, using either the Google App Engine or the Amazon EC2 platform.

Since Google App Engine and Amazon EC2 have distinct interfaces, service features and pricing policies, this tutorial gives separate instructions for each platform.

Google App Engine Tutorial

Written by Jonathan Jiang

updated by Julian Guo

Overview

A cloud platform for publishing web applications.

Simple, web-based application management console.

Developers can focus on application logic, with no need to worry about hardware, system administration, scalability, etc.

Supports Java, Python, and Go.

Guideline

0. Preparation
1. Create a Google Web Application Project
2. Debug, Run and Deploy
3. Interaction with User
4. Use Cloud Database
5. Pricing

0. Preparation

0.1. Sign up for a Google App Engine account: https://appengine.google.com/start

0.2. Download the App Engine SDK: http://code.google.com/appengine/downloads.html

0.3. For Java/Eclipse users, it is recommended to download the Eclipse plugins to build, debug and deploy your application: http://code.google.com/eclipse/docs/download.html

0.1. Sign up for a Google App Engine account:

You need to log in to your Gmail account to see this page. Sometimes your xxx@email.arizona.edu account does not work. If so, sign up for a new one.

Your application ID is shown on this page. Write it down.

Steps 0.2 and 0.3

Steps 0.2 and 0.3 can be combined in Eclipse: go to Help -> Install New Software, type https://dl.google.com/eclipse/plugin/3.7 in the “Work with” field and press Enter.

Then choose the required packages and download them. The other packages are optional.

1. Create a Google Web Application Project

1.1 Create a New Project

Now you should be able to create a Google App Engine project in Eclipse:

New -> Web Application Project

Type the project name and package you like, then choose the Google SDKs you want to use. Typically you only need ‘Use Google App Engine’ for your SDK.

1.2 File Structure of the Web Application

src/ includes all source files for your application (the Java source code).

META-INF/ includes other configuration files.

war/ includes all the files that are deployed and actually used on the server.

WEB-INF/ (inside war/) includes used libraries, compiled classes and configuration files.

Images, data, HTML and JSP files are put directly under the war/ folder.

In the WEB-INF folder, there are two configuration files:

- appengine-web.xml
- web.xml

The first five lines of appengine-web.xml look like:

    <?xml version="1.0" encoding="utf-8"?>
    <appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
        <application>your application ID</application>
        <version>1</version>
    </appengine-web-app>

Don’t forget to add your registered application ID between the <application> tags.

web.xml is SUPER IMPORTANT. It is mainly responsible for mapping URIs to your servlet classes and web pages. (Examples are provided later.)

2. Debug, Run and Deploy the Web Application

2.1. Debug and Run

The Eclipse plugin has already created a Hello World example for you. You can directly run your project and test whether it works: right-click the project folder -> Debug As -> Web Application.

In debug mode, Google App Engine will create a server on your local machine, and your project will run on that local server. If it is running successfully, the console will display a line indicating that the server is running.

If you use Eclipse, the server is running at http://localhost:8888/. You can open a web browser and paste this link to test your project.

When the server is running in debug mode, any changes to your project files should be automatically detected by Google App Engine, so you don’t have to rebuild the project (but you still need to refresh the browser to see the changes).

* Don’t over-trust this statement. If you keep encountering the same error, it is very likely that simply rebuilding the project will help you out.

An exception is web.xml. If you make changes to it, you must rebuild your project.

2.2. Deploy

When you are satisfied with your application, you can deploy it to the cloud environment Google provides so that users all over the world have access to it.

Simply click the ‘Deploy’ icon, and enter the account information for your App Engine account.

Now you can visit your application at http://your-applicationID.appspot.com

3. Interaction with User

Often, you want your application not only to present static information, but also to interact with users.

Your system needs to pass user inputs from web pages to your Java or Python program.

Here we provide a JSP/Java example of a movie related web mining application. This example returns movie’s plot based on the movie name given by users.

[Figure: data flow inside your application. User Input -> Interface (Web Pages/API) -> Web Mining Component (Server Side Logic) -> Output]

3.1 Receive User Input

Create form_input.jsp and add the following lines between the <body> </body> tags:

    Input a movie name here:
    <form action="/processinput" method="post">
        <div><textarea name="moviename" rows="3" cols="60"></textarea></div>
        <div><input type="submit" value="Submit" /></div>
    </form>

When the user visits form_input.jsp, it will show a field for input.

You want to pass the input to your Java Servlet application (your background program), say, SampleServlet.java.

You need to configure web.xml to let the system know how to map the form submission URI to the appropriate Java class. The following example maps http://your-applicationID.appspot.com/processinput to SampleServlet.class:

    <?xml version="1.0" encoding="utf-8"?>
    <web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xmlns="http://java.sun.com/xml/ns/javaee"
             xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
             xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
             version="2.5">
        <servlet>
            <servlet-name>Sample</servlet-name>
            <servlet-class>mis510.SampleServlet</servlet-class>
        </servlet>
        <servlet-mapping>
            <servlet-name>Sample</servlet-name>
            <url-pattern>/processinput</url-pattern>
        </servlet-mapping>
        <welcome-file-list>
            <welcome-file>index.html</welcome-file>
        </welcome-file-list>
    </web-app>
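The mapping idea can be pictured as a lookup table from URL pattern to servlet class. The sketch below is only an illustration in plain Java, not how the servlet container is actually implemented. The class name UrlMappingSketch and its resolve method are hypothetical, and only exact-match patterns are handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: a real servlet container reads web.xml itself and also
// supports wildcard patterns. This sketch handles exact matches.
class UrlMappingSketch {
    private static final Map<String, String> MAPPINGS = new LinkedHashMap<>();

    static {
        // Mirrors the <servlet-mapping> entry from the web.xml example.
        MAPPINGS.put("/processinput", "mis510.SampleServlet");
    }

    // Returns the servlet class name mapped to the path, or null if none is mapped.
    static String resolve(String path) {
        return MAPPINGS.get(path);
    }

    public static void main(String[] args) {
        System.out.println(resolve("/processinput"));
        System.out.println(resolve("/unknown"));
    }
}
```

A form that posts to /processinput therefore ends up in SampleServlet, while a path with no mapping is not handled by any servlet.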

3.2 Process User Input

Copy the following code to SampleServlet.java. Use the req.getParameter() method to obtain the user input (movie name) and process it in SampleServlet.java. An external API is used to retrieve the movie’s plot from the web.

    package mis510;

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.*;
    import myUtility.IMDB_Handler;

    public class SampleServlet extends HttpServlet {
        public void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            String movieName = req.getParameter("moviename");
            // The following code retrieves the input movie's plot using an external web API.
            IMDB_Handler imdbAPI = new IMDB_Handler();
            try {
                String movieID = imdbAPI.convert(movieName); // convert the movie name to its IMDB ID
                String result = imdbAPI.getPlot(movieID);    // get the movie plot
                req.setAttribute("result", result);          // save the plot in the request attribute "result"
                // Return the user to the original page.
                req.getRequestDispatcher("form_input.jsp").forward(req, resp);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
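Note that req.getParameter() returns null when the request carries no field of that name, and users may also submit blank input. A minimal sketch of a cleanup step that could run before the API call is shown below. The class InputSketch and the method cleanMovieName are hypothetical names and are not part of the sample code.

```java
// Hypothetical helper, not part of samplecode.rar.
class InputSketch {
    // Returns the trimmed movie name, or null when the field is missing or blank.
    static String cleanMovieName(String raw) {
        if (raw == null) {
            return null;
        }
        String trimmed = raw.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    public static void main(String[] args) {
        System.out.println(cleanMovieName("  Inception  "));
        System.out.println(cleanMovieName("   "));
    }
}
```

In the servlet, a null return value could be used to skip the API call and send the user straight back to the form.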

Here’s a snippet of the code that uses the API. The complete sample code is given in ‘samplecode.rar’.

    public class IMDB_Handler {
        static private String EndPoint = "http://imdbapi.com/";

        public IMDB_Handler() {}

        /* public static String convert(String moviename) {} */

        public String getPlot(String movieID) {
            String RESTurl, plot;
            RESTurl = EndPoint + "?id=" + movieID;           // build the REST request URL
            try {
                HTTPProxy handler = new HTTPProxy();
                String content = handler.GetContent(RESTurl); // fetch the response body
                JSONObject jobj = new JSONObject(content);    // parse it as JSON
                plot = jobj.getString("plot_simple");
            } catch (Exception e) {
                plot = "The API server is currently down.";
            }
            return plot;
        }
    }
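The snippet builds the request URL by plain string concatenation, which works for IMDB IDs but would break if the value contained spaces or characters such as &. A minimal sketch of a safer variant, using java.net.URLEncoder from the standard library, is shown below. The class RestUrlSketch and the method buildPlotUrl are hypothetical names chosen for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper showing URL-encoding of a query parameter.
class RestUrlSketch {
    static final String END_POINT = "http://imdbapi.com/";

    // Builds the REST URL with the id value encoded for use in a query string.
    static String buildPlotUrl(String movieId) {
        try {
            return END_POINT + "?id=" + URLEncoder.encode(movieId, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildPlotUrl("tt0111161"));
    }
}
```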

3.3 Return the Output to User

Now you can display the results to the user by adding a line to the designated JSP page. In this example, we use the same JSP page as for user input. Now form_input.jsp should look like:

    <body>
    Input a movie name here:
    <form action="/processinput" method="post">
        <div><textarea name="moviename" rows="3" cols="60"></textarea></div>
        <div><input type="submit" value="Submit" /></div>
    </form>

    <%=request.getAttribute("result") %> <%-- add this line to display the value in "result" --%>
    </body>

Try it at http://localhost:8888/form_input.jsp
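The JSP expression above prints whatever is stored in "result" verbatim, and it prints the text "null" on the first visit, when no attribute has been set yet. A minimal sketch of a helper that addresses both points is shown below. HtmlEscapeSketch and escapeHtml are hypothetical names, and the escaping covers only the five characters that are special in HTML.

```java
// Hypothetical helper, not part of samplecode.rar.
class HtmlEscapeSketch {
    // Returns text that is safe to place inside HTML; null becomes an empty string.
    static String escapeHtml(String text) {
        if (text == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("<b>Tom & Jerry</b>"));
    }
}
```

In the JSP page, such a helper could wrap the attribute value before it is written into the page.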

4. Use Cloud Database

Situations where using a cloud database may help:

- Remember user activities.
- Store the results of the web mining process to speed up the next inquiry.
- Upload a large file which is a component of your application.
- ...

In the next slides we show an example of using Google Datastore to save and retrieve users’ comments for movies.

Update form_input.jsp to receive user comments:

    <form action="/processinput" method="post">
        Input Movie Name Here:
        <div><input type="text" name="moviename" size="60"></div>
        Input Your Name Here:
        <div><input type="text" name="username" size="60"></div>
        Type Your Comment Here:
        <div><textarea name="comment" rows="3" cols="60"></textarea></div>
        <div><input type="submit" value="Submit" /></div>
    </form>

    Movie Plot: <br><%=request.getAttribute("result") %>

4.1 Google Datastore

4.1.1 Store Comments

Add this component to SampleServlet.java. (For the complete sample, please refer to samplecode.rar.)

    // Store the user comment under a key derived from the movie name.
    Key movieKey = KeyFactory.createKey("MovieComment", movieName);
    String content = req.getParameter("comment");
    String username = req.getParameter("username");
    Date date = new Date();
    Entity comment = new Entity("Comment", movieKey);
    comment.setProperty("user", username);
    comment.setProperty("date", date);
    comment.setProperty("content", content);

    DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
    datastore.put(comment);

    // Send the user back to the form, passing the movie name along.
    try {
        req.getRequestDispatcher("form_input.jsp?moviename=" + movieName).forward(req, resp);
    } catch (ServletException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }

With this component added, the modified SampleServlet.java looks like this. (For the complete sample, please refer to samplecode.rar.)

    package mis510;

    import java.io.IOException;
    import java.util.Date;
    import javax.servlet.ServletException;
    import javax.servlet.http.*;
    import com.google.appengine.api.datastore.DatastoreService;
    import com.google.appengine.api.datastore.DatastoreServiceFactory;
    import com.google.appengine.api.datastore.Entity;
    import com.google.appengine.api.datastore.Key;
    import com.google.appengine.api.datastore.KeyFactory;
    import myUtility.IMDB_Handler;

    public class SampleServlet extends HttpServlet {
        public void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            String movieName = req.getParameter("moviename");
            IMDB_Handler imdbAPI = new IMDB_Handler();
            try {
                String movieID = imdbAPI.convert(movieName); // convert the movie name to its IMDB ID
                String result = imdbAPI.getPlot(movieID);    // get the movie plot
                req.setAttribute("result", result);          // save the plot in the request attribute "result"

                // Store the user comment under a key derived from the movie name.
                Key movieKey = KeyFactory.createKey("MovieComment", movieName);
                String content = req.getParameter("comment");
                String username = req.getParameter("username");
                Date date = new Date();
                Entity comment = new Entity("Comment", movieKey);
                comment.setProperty("user", username);
                comment.setProperty("date", date);
                comment.setProperty("content", content);

                DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
                datastore.put(comment);

                req.getRequestDispatcher("form_input.jsp?moviename=" + movieName).forward(req, resp);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

4.1.2 Retrieve Comments

Add these page directives to form_input.jsp, before <html>. (For the complete sample, please refer to samplecode.rar.)

    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
    <%@ page import="java.util.List" %>
    <%@ page import="com.google.appengine.api.datastore.DatastoreServiceFactory" %>
    <%@ page import="com.google.appengine.api.datastore.DatastoreService" %>
    <%@ page import="com.google.appengine.api.datastore.Query" %>
    <%@ page import="com.google.appengine.api.datastore.Entity" %>
    <%@ page import="com.google.appengine.api.datastore.FetchOptions" %>
    <%@ page import="com.google.appengine.api.datastore.Key" %>
    <%@ page import="com.google.appengine.api.datastore.KeyFactory" %>

Then add this component to form_input.jsp, between <body> and </body>. (For the complete sample, please refer to samplecode.rar.)

    <%
        String movieName = request.getParameter("moviename");
        if (movieName == null) movieName = "default";
        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
        Key movieKey = KeyFactory.createKey("MovieComment", movieName);

        // Fetch the five most recent comments stored under this movie's key.
        Query query = new Query("Comment", movieKey).addSort("date", Query.SortDirection.DESCENDING);
        List<Entity> comments = datastore.prepare(query).asList(FetchOptions.Builder.withLimit(5));
        if (comments.isEmpty()) {
    %>
    <p>The movie '<%= movieName %>' has no comments posted yet.</p>
    <%
        } else {
    %>
    <p>Comments for movie '<%= movieName %>'.</p>
    <%
            for (Entity comment : comments) {
                if (comment.getProperty("user") == null) {
    %>
    <p>An anonymous person wrote:</p>
    <%
                } else {
    %>
    <p><b><%= comment.getProperty("user") %></b> wrote:</p>
    <%
                }
    %>
    <blockquote><%= comment.getProperty("content") %></blockquote>
    <%
            }
        }
    %>
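The query above asks for the comments stored under one movie's key, newest first, limited to five. To make that behavior concrete, here is an in-memory analogy in plain Java. It does not use the Datastore API at all. The class CommentStoreSketch and its methods are hypothetical and only mimic the group-by-movie, sort-by-date and limit steps.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory analogy only: comments are grouped by movie name, the way the
// Datastore example groups Comment entities under a MovieComment key.
class CommentStoreSketch {
    static final class Comment {
        final String user;
        final Date date;
        final String content;

        Comment(String user, Date date, String content) {
            this.user = user;
            this.date = date;
            this.content = content;
        }
    }

    private final Map<String, List<Comment>> byMovie = new HashMap<>();

    // Analogous to datastore.put(comment) with the movie key as parent.
    void put(String movieName, Comment comment) {
        byMovie.computeIfAbsent(movieName, k -> new ArrayList<>()).add(comment);
    }

    // Analogous to the sorted query with FetchOptions.Builder.withLimit(limit).
    List<Comment> latest(String movieName, int limit) {
        List<Comment> stored = byMovie.get(movieName);
        List<Comment> all = (stored == null) ? new ArrayList<Comment>() : new ArrayList<Comment>(stored);
        all.sort(Comparator.comparing((Comment c) -> c.date).reversed());
        return new ArrayList<Comment>(all.subList(0, Math.min(limit, all.size())));
    }
}
```

A movie without any stored comments yields an empty list, which corresponds to the "no comments posted yet" branch in the JSP code.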

Advantages of Google Datastore:

- Google provides data management capacity for you.
- Very flexible (schemaless).
- Option to view and manage the data online: log in to Google App Engine at https://appengine.google.com/, then choose your application -> Datastore Viewer.

Disadvantages:

- Limit of 1 GB free data storage quota, compared to Amazon EC2 (10 GB).
- Only for small data objects (entities) in Datastore. To store larger data, Google Blobstore can be used: http://code.google.com/appengine/docs/java/blobstore/overview.html

5. Pricing

Google App Engine sets a resource usage quota for free applications.

Free Quota for Major Resources

    Resource                              Daily Limit
    Frontend Instance Hours               28 hr
    High Replication Datastore Storage    1 GB
    Datastore Reads                       50k ops
    Datastore Writes                      50k ops
    Outgoing Network Traffic              1 GB
    Incoming Network Traffic              1 GB

For more details: https://cloud.google.com/products/app-engine/#pricing

For resource usage exceeding the free quota, Google charges at the rates below.

Billing Rate for Major Resources

    Resource                    Rate
    Frontend Instance Hours     $0.08/hour
    Datastore Storage           $0.18/GB/month
    Datastore Reads             $0.06/100k read ops
    Datastore Writes            $0.09/100k write ops
    Outgoing Network Traffic    $0.12/GB
    Incoming Network Traffic    Free
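Combining the free daily quotas with these billing rates gives a rough way to estimate a daily charge: only the usage above each quota is billed. The sketch below works this out in Java under stated assumptions. The class AppEngineCostSketch and its methods are hypothetical, datastore storage is left out because it is billed per month rather than per day, and actual bills may be computed differently.

```java
// Rough estimate only, using the quota and rate figures quoted in this tutorial.
class AppEngineCostSketch {
    // Free daily quotas.
    static final double FREE_INSTANCE_HOURS = 28.0;
    static final double FREE_READ_OPS = 50_000;
    static final double FREE_WRITE_OPS = 50_000;
    static final double FREE_OUTGOING_GB = 1.0;

    // Billing rates in US dollars.
    static final double RATE_INSTANCE_HOUR = 0.08;
    static final double RATE_PER_100K_READS = 0.06;
    static final double RATE_PER_100K_WRITES = 0.09;
    static final double RATE_OUTGOING_GB = 0.12;

    // Usage above the free quota, never negative.
    static double over(double used, double free) {
        return Math.max(0.0, used - free);
    }

    // Estimated charge in dollars for one day of usage.
    static double dailyCost(double instanceHours, double readOps, double writeOps, double outgoingGb) {
        return over(instanceHours, FREE_INSTANCE_HOURS) * RATE_INSTANCE_HOUR
             + over(readOps, FREE_READ_OPS) / 100_000 * RATE_PER_100K_READS
             + over(writeOps, FREE_WRITE_OPS) / 100_000 * RATE_PER_100K_WRITES
             + over(outgoingGb, FREE_OUTGOING_GB) * RATE_OUTGOING_GB;
    }

    public static void main(String[] args) {
        // 40 instance hours, 200k reads, 100k writes, 3 GB outgoing traffic.
        System.out.printf("%.3f%n", dailyCost(40, 200_000, 100_000, 3));
    }
}
```

For the usage in main, the billable parts are 12 instance hours ($0.96), 150k reads ($0.09), 50k writes ($0.045) and 2 GB of outgoing traffic ($0.24), about $1.34 for the day.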

Costs vary greatly depending on resource usage. The following table gives a rough estimate of daily costs for typical apps:

                            App 1    App 2     App 3
    Datastore               1 GB     10 GB     10 GB
    Bandwidth (in & out)    1 GB     1 GB      5 GB
    Cost                    Free     $5/day    $15/day

Suggestions for reducing cost:

- Log in to the App Engine Console and set a daily budget. This is the safest way to control your cost, but resource usage exceeding this budget will not be allowed (so your app will throw errors).
- Reduce instance hours.
- Datastore is expensive.
- Debug on your local server most of the time (completely free!).
- Deploy the full version of your app only during the last weeks of MIS 510.

Applying these suggestions will reduce the cost of your project.

Amazon EC2 Tutorial

Written by Julian Guo

Amazon Elastic Compute Cloud (Amazon EC2)

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers.

- Simple web service interface
- Complete control of your computing resources
- Fast: obtain and boot new server instances in a short time
- Quickly scale capacity as your computing requirements change
- Pay only for capacity that you actually use

Tutorial Guideline

1. Sign up for EC2
2. Launch an Instance
3. Connect to Windows Instance
4. Connect to Unix/Linux Instance
5. Application Example
6. Pricing
7. Resources

1. Sign Up for EC2

Sign up for an Amazon EC2 account: http://aws.amazon.com/ec2/

If you have an Amazon shopping account, you can just use that account.

2. Launch an Instance

Sign in to the AWS Management Console (choose EC2): http://aws.amazon.com/console/

Create and Download a Key Pair

A key pair is a security credential similar to a password, which you use to securely connect to your instance after it's running.

Choose an Amazon Machine Image (AMI)

- Amazon Linux
- Windows Server 2008 with SQL Server
- Red Hat/Ubuntu/Debian Linux

Choosing an AMI is just like choosing a virtual machine. You can choose 64-bit or 32-bit machines. Prices differ between machines.

Configure Firewall (create a security group)

Create rules to get access to the instance.

For a Windows server, we need HTTP port 80, MS SQL port 1433, Remote Desktop port 3389 and HTTP port 8080 (for Tomcat).

For Linux, we need SSH to log in (to use PuTTY and WinSCP).

3. Connect to Windows Instance

Go to the AWS Management Console and locate the instance on the Instances page.

Right-click the instance and select Get Windows Password.

Use Remote Desktop to log in with the decrypted password.

Get an elastic IP (static IP)

- Click “Elastic IP” in “Navigation”.
- Click “Allocate New Address”.
- Associate the address with your instance.

An elastic address is a scarce resource. You should release the address if you don’t want to associate it with any instance. Otherwise, Amazon will charge you money!

Log in with the elastic IP, and you get a Windows server!

Manage and Control the Server

- Stop = shut down the computer
- Reboot = restart the computer
- Terminate = throw away your computer!

You can monitor your instance in the AWS Management Console.

4. Connect to Unix/Linux Instance

Install PuTTY on your Windows machine.

Start PuTTYgen (e.g., from the Start menu, click All Programs > PuTTY > PuTTYgen).

Click Load and browse to the location of the private key file that you want to convert (e.g., hello.pem), then convert it into hello.ppk.

Save hello.ppk somewhere.

Use PuTTY to connect

- Open PuTTY.
- Use the Public DNS as the hostname.
- Use root (Red Hat), bitnami (Ubuntu), or ec2-user (Amazon Linux) as the username.
- Click SSH -> Auth to load the .ppk file.

Example: Log in to a Red Hat system

Use WinSCP to connect

- Install WinSCP on your Windows machine.
- Use the Public DNS as the hostname.
- Use root (Red Hat), bitnami (Ubuntu), or ec2-user (Amazon Linux) as the username.
- Load the .ppk file (get it from PuTTYgen).
- Click Login.

Example: Log in to a Red Hat system

5. Application Example (deploying my project from last year)

- Use a Micro On-Demand Instance.
- Run Windows Server 2008 with an elastic IP.
- SQL Server 2008 R2 is ready.
- Install Firefox, Java JRE, Tomcat 7.0 (server), Eclipse IDE, and Dropbox (for data transmission).
- Deploy my web application on this server (run a Tomcat server from Eclipse)!
- Get access to the web application via HTTP port 8080.

Tomcat on Eclipse

6. Pricing

Pay only for what you use. There is no minimum fee.

See details: http://aws.amazon.com/ec2/pricing/

Estimate your monthly bill using the AWS Simple Monthly Calculator.

You might pay for:

- EC2 instances
- Elastic IP
- Data transfer (in and out)
- Amazon EBS storage

Amazon EC2 Instance Purchasing Options

Amazon EC2 provides customers with three different purchasing models that give you the flexibility to optimize your costs.

On-Demand Instances: pay for compute capacity by the hour.

Reserved Instances: one-time payment (1-year or 3-year term), cheaper than On-Demand Instances.

Spot Instances: bid for unused Amazon EC2 capacity (can be very cheap with a good bidding strategy).

For a typical MIS 510 project, you might pay $20-30 in total. You can prepare your code on local platforms and deploy the project code on EC2 for just 1-2 weeks. To save running hours, you can shut down EC2 at night.

Notice: Prices also vary between regions.
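Since On-Demand Instances are billed by the hour, the saving from shutting the instance down at night is simple arithmetic. The sketch below illustrates it in Java. The class Ec2CostSketch is hypothetical, and the hourly rate of $0.05 is a placeholder chosen for the example, not an actual AWS price. Look up the real rate for your instance type and region on the pricing page.

```java
// Rough estimate of instance charges only; Elastic IP, data transfer and
// EBS storage are not included.
class Ec2CostSketch {
    // Cost in dollars of running one on-demand instance for the given schedule.
    static double onDemandCost(double hourlyRate, int days, int hoursPerDay) {
        return hourlyRate * days * hoursPerDay;
    }

    public static void main(String[] args) {
        double placeholderRate = 0.05; // assumed value, not a real AWS price
        System.out.printf("always on:    %.2f%n", onDemandCost(placeholderRate, 14, 24));
        System.out.printf("off at night: %.2f%n", onDemandCost(placeholderRate, 14, 16));
    }
}
```

With this placeholder rate, two weeks of continuous running cost $16.80, while running 16 hours per day costs $11.20.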

Appendix. Resources

Documentation and tutorials:
http://code.google.com/appengine/docs/
http://aws.amazon.com/documentation/ec2/

Google App Engine main page: http://code.google.com/appengine

Amazon AWS main page: http://aws.amazon.com/

AI Lab’s resources for MIS 510: http://ai.arizona.edu/mis510/
