the history of java: a case · why is the jar standard better? simplicity / extensibility: crypto...

Post on 27-May-2020

The History of Java: A case study in software engineering

Dan S. Wallach and Mack Joyner, Rice University

Copyright © 2016 Dan Wallach, All Rights Reserved

The Beginnings

Sun Microsystems (popular Unix system vendor) research project

1990: Design goal: programming for small devices

• Set-top boxes, interactive television, PDAs
• Single-digit MHz CPUs, single-digit MBytes of RAM, etc.

1991-1993: James Gosling and Patrick Naughton develop the “Oak” programming language

1993: The Web is taking off, Oak ➜ Java for web applets

NCSA Mosaic

Written by UIUC undergrads, some of whom went on to found Mosaic Communications Corp. in 1994, later renamed Netscape

They started over and created Netscape Navigator, the ancestor of today’s Firefox.

HotJava (1995)

Web browser, written in Java.

Supported Java “applets” that ran inside the browser

• “Mobile code”, not just HTML
• This happened before JavaScript!

Late 1995, Netscape announced it would license Java.

Java language principles

Familiar C / C++ syntax (curly braces, etc.)

But really, closer to languages like Modula-2, etc.

• No pointer arithmetic: references to objects are opaque
• No operator overloading
• No multiple inheritance of classes
• No preprocessor
• No explicit memory deallocation
• No global functions - everything in a class

Nice features of Java, all there from day 1

Garbage collector

“Everything is an object” hierarchy

Type safety (vs. C which lets you pretend any memory has any type)

Bounds checking on arrays (vs. C which doesn’t care)

Real strings (vs. C’s array of characters)

Package system

Interfaces

Exceptions for error handling (vs. no standard error handling in C)

Multithreading with language support for locking and synchronization

JavaDoc documentation system
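
Two of the features above, bounds checking and type safety, can be seen in a few lines. This is a minimal sketch; the class name `SafetyDemo` and its method names are made up for illustration. Where C would silently read or reinterpret memory, Java raises an exception.

```java
// Hypothetical demo class: shows Java's runtime checks on arrays and casts.
final class SafetyDemo {
    // Returns the name of the exception raised by an out-of-bounds array read.
    static String outOfBounds() {
        int[] data = new int[3];
        try {
            return String.valueOf(data[3]); // index 3 is one past the end
        } catch (ArrayIndexOutOfBoundsException e) {
            return "ArrayIndexOutOfBoundsException";
        }
    }

    // Returns the name of the exception raised by an invalid cast.
    static String badCast() {
        Object o = "a string";
        try {
            Integer i = (Integer) o; // the runtime checks the object's real type
            return String.valueOf(i);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(outOfBounds());
        System.out.println(badCast());
    }
}
```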

Java is C++ without the guns, knives, and clubs. - James Gosling

Java underlying design

Java compiler: emits machine-independent “bytecode” (.class files)
Stack-machine approach (push, pop, etc., just like your RPN calculator)
Standardized types
Portable: write once, run anywhere

Bytecode is executed on a Java “virtual machine”
All objects are allocated on the heap, not on the stack
Interpreter was small, performance was “good enough”
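
The stack-machine approach can be sketched with a toy evaluator. This is only an illustration of the idea: the class `TinyStackMachine` and its textual opcodes (`push`, `add`, `mul`) are invented here and are not the real JVM instruction set, which uses binary instructions such as `iload`, `iadd`, and `imul`.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy stack machine with hypothetical opcodes, not real JVM bytecode.
final class TinyStackMachine {
    // Runs a whitespace-separated program such as "push 2 push 3 add".
    static int run(String program) {
        Deque<Integer> stack = new ArrayDeque<>();
        String[] tokens = program.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            switch (tokens[i]) {
                case "push":
                    // The operand follows the opcode in the token stream.
                    stack.push(Integer.parseInt(tokens[++i]));
                    break;
                case "add":
                    stack.push(stack.pop() + stack.pop());
                    break;
                case "mul":
                    stack.push(stack.pop() * stack.pop());
                    break;
                default:
                    throw new IllegalArgumentException("unknown opcode: " + tokens[i]);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // Computes (2 + 3) * 4 the way an RPN calculator would.
        System.out.println(run("push 2 push 3 add push 4 mul"));
    }
}
```

Every operation finds its inputs on the stack and leaves its result there, so the instruction format never has to name a machine register.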

Java security?

Java wasn’t initially meant to be “secure” for running untrusted code
Security properties were grafted on with HotJava

Bytecode verifier: reject “malicious” bytecode
Example: verify that you never call a “private” method from outside

Dean, Felten, Wallach (’96): security was easy to break

Today, Java in the browser for untrusted code is dead. Java is widely used for server apps, Android apps, and other places where malicious code isn’t a problem.

1996

JDK (Java Development Kit) 1.0 was released

Compiler and runtime for Solaris, Linux, Windows, Mac OS

Java was a huge hit with industrial programmers and in CS education

Sun created JavaSoft (30-40 employees in 1996)
Budimlić (co-taught Comp215 last year) interned at JavaSoft

Netscape was shipping Java to millions of users
Wallach interned at Netscape

Engineering challenges for security

Proposal #1 (Mark Miller and others, “E” language)
Let’s redo all the Java libraries to improve security!
“Capability” style libraries have useful security properties

• No public constructors or static methods
• Instead, somebody passes you a “filesystem” or a “network” at the beginning, and you query it

Rejected: Too many people already using the Java libraries as-is
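
A rough sketch of what a capability-style library might look like. None of these names come from E or from any real Java library: `FileSystemCap`, `InMemoryFs`, `ReadOnlyFs`, and `UntrustedApplet` are all invented for illustration. The point is that untrusted code has no static entry point to the filesystem and can only use the object it was handed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical capability: the only way to touch "files" is through this object.
interface FileSystemCap {
    String read(String path);
    void write(String path, String contents);
}

// In-memory stand-in for a real filesystem, for illustration only.
final class InMemoryFs implements FileSystemCap {
    private final Map<String, String> files = new HashMap<>();

    public String read(String path) {
        return files.get(path);
    }

    public void write(String path, String contents) {
        files.put(path, contents);
    }
}

// Read-only view: weakens the capability before it is handed to untrusted code.
final class ReadOnlyFs implements FileSystemCap {
    private final FileSystemCap inner;

    ReadOnlyFs(FileSystemCap inner) {
        this.inner = inner;
    }

    public String read(String path) {
        return inner.read(path);
    }

    public void write(String path, String contents) {
        throw new SecurityException("read-only capability");
    }
}

final class UntrustedApplet {
    // The applet can only use what it is passed; there is no static file API to call.
    static String run(FileSystemCap fs) {
        return fs.read("greeting.txt");
    }
}
```

The trusted host decides which capability to pass in, for example `UntrustedApplet.run(new ReadOnlyFs(realFs))`, so the security policy lives in ordinary object references.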

Adding security to Java

Original hack: look at the Java call stack
If the stack is “thick” enough with system code, then it must be fine

Netscape improvement: stack annotations
System code “enables” security-critical APIs only when necessary
If security isn’t enabled, APIs will fail

But the compiler writers hate stack annotations
Restricts their ability to do performance optimizations

Modern solutions don’t do this (take Comp427, learn more later!)

Engineering challenge: startup latency

Java 1.0: each class compiles to a separate .class file
Each file has to be downloaded as a separate web request
Modestly complex Java applets took over a minute to load

Solution: put everything in a zip file, extend the <applet> tag
Implemented on the side by one Netscape engineer (Warren Harris)
Didn’t ask anybody for permission, just did it (shipped in Netscape 3.0)
No backward compatibility issues

Engineering challenge: adding signatures

Goal: Add a “digital signature” scheme to Java
“Signed applets” could have more security privileges

Challenge: Where to add the crypto without breaking things?
Meeting at JavaSoft: three Netscape engineers + three JavaSoft engineers (including Wallach!)

Failed idea: shoehorn crypto into the .class files
Backward compatibility? Makes it hard to evolve the class file spec

Final idea: add a signatures subdirectory to the .zip file
Netscape’s hack became a real standard and grew additional features

• A “jar” file is just a “zip” file with a particular layout on the inside
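
The claim that a jar is just a zip can be checked with the standard library: write an archive with `java.util.jar` and read it back with the plain `java.util.zip` reader. This is a small sketch; the class name `JarIsZip` and the fake `Hello.class` entry are placeholders for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class JarIsZip {
    // Builds a tiny jar in memory: a manifest plus one placeholder entry.
    static byte[] buildJar() {
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            jar.putNextEntry(new ZipEntry("Hello.class"));
            // Placeholder contents: just the class-file magic number, not a real class.
            jar.write(new byte[] {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE});
            jar.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Lists entry names using only the generic zip reader.
    static List<String> listEntries(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(listEntries(buildJar()));
    }
}
```

The "particular layout" shows up as the `META-INF/MANIFEST.MF` entry that the jar writer adds alongside the class files.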

Why is the Jar standard better?

Simplicity / Extensibility: crypto code only sees arrays of bytes
Crypto can work on other file types (images, etc.)

Orthogonality: Java .class file format is unrelated to the crypto
If the class file format changes, crypto doesn’t care
If you don’t care about crypto, you can just ignore the crypto

Performance: faster downloads (single HTTP get)

Fun fact: The Jar spec defined two different crypto algorithms
And, of course, Netscape implemented one, JavaSoft the other one
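
To make the "crypto code only sees arrays of bytes" point concrete, here is a small sketch using the standard `java.security` API. It is not the jar signing format itself; the class `ByteSigner` and the choice of RSA with SHA-256 are assumptions for illustration. The signing code never learns whether the bytes are a class file or an image.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical helper: signs and verifies opaque byte arrays.
final class ByteSigner {
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Signs arbitrary content; nothing here depends on the file type.
    static byte[] sign(KeyPair keys, byte[] content) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(keys.getPrivate());
            signer.update(content);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true only if the signature matches the content under the public key.
    static boolean verify(PublicKey key, byte[] content, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(key);
            verifier.update(content);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the class file format never appears in this code, the format can change without touching the crypto, which is the orthogonality argument above.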

Other Java features trickle in

JDK 1.1 (1997): Java gets lots more useful for general programming!
inner/anonymous classes, RMI, reflection, JIT compiler

JDK 1.2 (aka Java2) (1998): New libraries!
updated security model (joint with Netscape)
java.util “collections” classes
“strictfp” updates (useful for scientific computing)

Java3 (2000): Performance!
HotSpot JVM (radically improved performance)

Java4 (2002): More new libraries!

Meanwhile at Microsoft

Internet Explorer 3 (1996) had its own version of Java
Reimplemented from scratch inside Microsoft
Tweaked the language
New libraries

• Windows-only!

“Embrace and extend”
Sun sued Microsoft (1997)

Microsoft’s solution (in 2000): C#

C# is sort of Java with reliability, productivity, and security deleted. - James Gosling

Actually, C# learned a lot of lessons from Java
Arrays of objects are contiguous in memory: better for performance
Common Language Runtime (CLR) supported VisualBasic, others
Built from the beginning to be compiled
Better integration with Microsoft COM
Much simpler for writing a new Windows program

Microsoft Research got involved in C# security before it shipped

C#, F#, Spec#, and others: Microsoft is still innovating around C#

Java5 (2004): Added generics!

Java5 was a huge change from earlier Java
No generics prior to 2004, even though they were widely requested

Lots of other features
Autoboxing/unboxing of primitive types (int/Integer, etc.)
Static imports
Annotations (e.g., @Override)
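
A short sketch showing several of these Java5 features in one place. The class name `Java5Demo` is invented for illustration, and the enhanced `for` loop is another Java5 addition that the list above does not mention.

```java
import static java.lang.Math.max; // static import: call max() without the Math. prefix

import java.util.List;

// Hypothetical demo class exercising Java5-era language features.
final class Java5Demo {
    // Generics: the parameter is a List<Integer>, not a raw List of Object.
    static int largest(List<Integer> numbers) {
        int best = Integer.MIN_VALUE;
        for (int n : numbers) {   // each Integer is unboxed to an int
            best = max(best, n);
        }
        return best;
    }

    @Override // annotation: the compiler checks this really overrides Object.toString()
    public String toString() {
        return "Java5Demo";
    }
}
```

Calling `Java5Demo.largest(Arrays.asList(3, 7, 5))` relies on autoboxing in the other direction: the `int` literals become `Integer` objects.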

Java generic engineering

Challenge: should List<Integer> and List<String> be different?
In Java 4, there was only java.util.List (i.e., List of Object)
With C++ templates, the compiler specializes the code

• Benefit: performance, Cost: code-size bloat

Complication: Should the JVM have to know about generics?
Many JVMs now besides Sun (e.g., IBM)

Solution: Type erasure
Type parameters only exist in the source code, not the compiled code!
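
Type erasure is easy to observe at runtime. In this minimal sketch (the class name `ErasureDemo` is made up), two lists with different type parameters turn out to share a single runtime class.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // True because the type parameters are erased: both objects are plain ArrayLists.
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        return ints.getClass() == strings.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
    }
}
```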

Type erasure: good, bad, ugly

Good: we can implement singleton empty lists

    static <T> IList<T> makeEmpty() {
        @SuppressWarnings("unchecked")
        IList<T> typedEmptyList = (IList<T>) Empty.SINGLETON;
        return typedEmptyList;
    }

Bad: you can’t say new T() (because T isn’t there at runtime)
T isn’t a “reified type” (C# supports reified types)

Ugly: you instead end up passing around Class<T> instances

    public class NamedMatcher<PatternT extends Enum<PatternT> & NamedMatcher.TokenPatterns> {
        public NamedMatcher(@NotNull Class<PatternT> enumPatternsClazz) { …
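
The Class<T> workaround can be sketched in isolation. This is not the course's NamedMatcher code; `Factory.create` is a hypothetical helper showing how a class token stands in for the missing `new T()`.

```java
// Hypothetical helper: builds an instance of T from a Class<T> token.
final class Factory {
    // "new T()" will not compile, so the caller supplies the runtime class instead.
    static <T> T create(Class<T> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + type, e);
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = create(StringBuilder.class);
        sb.append("made via a class token");
        System.out.println(sb);
    }
}
```

The cost is visible at every call site: the caller has to mention the type twice, once as the type argument (usually inferred) and once as the `.class` token.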

Remember: we want to find bugs early!

    // create an array of strings
    String[] strings = new String[10];
    // cast it to an array of objects
    Object[] objects = strings;
    // insert an object into the array
    objects[0] = new Object();    // runtime exception (ArrayStoreException)

Similar code with generic Vectors:

    // create a vector of strings
    Vector<String> strings = new Vector<String>(10);
    // cast it to a vector of objects
    Vector<Object> objects = (Vector<Object>)strings;    // compiler error
    // insert an object into the vector
    objects.add(new Object());
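
A runnable version of the array half of this comparison, plus one way generics express the safe part of the same idea. The class `CovarianceDemo` is made up for illustration; the wildcard method is an addition of this sketch, not something taken from the slide.

```java
import java.util.Vector;

final class CovarianceDemo {
    // Arrays are covariant, so the bad store is only caught when it executes.
    static boolean arrayStoreFails() {
        String[] strings = new String[10];
        Object[] objects = strings;        // legal at compile time
        try {
            objects[0] = new Object();     // rejected at run time
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // A wildcard view allows reading elements as Object but rejects add() at compile time.
    static Object firstElement(Vector<? extends Object> items) {
        // items.add(new Object());  // would not compile
        return items.get(0);
    }
}
```

With generics the mistake never reaches a running program, which is the "find bugs early" goal.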

Java6 (2006)

Mostly, lots of library support

Notable new features
Pluggable annotations (e.g., @Contract and @NotNull)
JavaScript integration (more on this in a few weeks)

Meanwhile at Yahoo!

Hadoop: grew out of the Nutch project (started in 2002); developed at Yahoo! from 2006

Now: Open source (Apache Software Foundation)

Google’s MapReduce programming model
Decompose your programs into “maps” and “folds” (sound familiar?)

Java virtual machines on all nodes

Hadoop Distributed filesystem (HDFS)

Widely used across industry (newer alternatives are better…)
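
The "maps and folds" decomposition can be shown on a single machine with Java8 streams. This is only an analogy for the programming model, not Hadoop's API; the class `WordLengths` is invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class WordLengths {
    // Total number of characters across all words, as a map followed by a fold.
    static int totalLength(List<String> words) {
        return words.stream()
                .map(String::length)       // "map": transform each element independently
                .reduce(0, Integer::sum);  // "fold": combine the results into one value
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("map", "reduce")));
    }
}
```

Because each map step is independent and the fold is associative, a framework is free to spread the work across many nodes and combine partial results.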

Meanwhile, Android!

Android, Inc. started in 2003, acquired by Google in 2005
All Android apps (and half of Android itself) are written in Java.

Android doesn’t use the JVM, or any of the Java graphics libraries
All built from scratch. But Sun seemed okay with this.

Meanwhile, Apple!

Mac OS X (circa 2000) supported Java
Apple integrated updates from Sun.

Stock Java language, custom APIs for access to Mac features

2009: Oracle announces its acquisition of Sun (completed in 2010)

2011: Java7

Support for dynamic languages (new JVM bytecode instruction: invokedynamic)
Later used by the Nashorn JavaScript engine, added in Java8

Better support for external compilers/debuggers (like IntelliJ)

Type inference (the diamond operator, <>)

Better support for 64-bit computers

Oracle decides to “monetize” Java

Legal risk? What do you do?

Apple’s solution: Swift (a whole new programming language)
Lessons learned from Java, C#, others
Special language syntax to deal with null
Compatible with existing Objective-C code

• Reference-counted memory (not GC!)

Microsoft looks pretty smart
Owning their core technologies…

What did Google do?

Android supported Java APIs, but reimplemented the libraries and VM from scratch
Oracle’s lawsuit: copyright on APIs?
Outcome: Google won (so far)

Recently, Google has partial Java8 support with their own toolchain

Meanwhile, Java8 (2014)

First serious changes to Java since Java5 (ten years earlier!)

Lambdas!

• Using the invokedynamic instruction introduced in Java7.
• Labeling old interfaces “functional” allows lambdas to replace many existing uses of anonymous inner classes.

Default methods on interfaces!
Type inference!
Java streams (parallel execution on functional list-like things)!
Better garbage collection!
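
A side-by-side sketch of a lambda replacing an anonymous inner class. `Comparator` is a real pre-Java8 interface that now counts as functional; the class `LambdaDemo` and its method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class LambdaDemo {
    // Pre-Java8 style: an anonymous inner class implementing Comparator.
    static List<String> sortByLengthOld(List<String> in) {
        List<String> out = new ArrayList<>(in);
        out.sort(new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return out;
    }

    // Java8 style: the same comparator written as a lambda.
    static List<String> sortByLengthNew(List<String> in) {
        List<String> out = new ArrayList<>(in);
        out.sort((a, b) -> Integer.compare(a.length(), b.length()));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortByLengthNew(Arrays.asList("ccc", "a", "bb")));
    }
}
```

Both methods hand `List.sort` an object of the same interface type, which is why existing APIs could accept lambdas without being rewritten.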

Challenge: adding new features for old code

Java collections classes (HashMap, ArrayList, etc.) are roughly the same as a decade earlier, used everywhere
You can’t change any API without breaking everything!
Third-party libraries also implement the same interfaces!

Default methods on interfaces are a clean and clever solution
Add a new default method to the interface:

• It works on every implementation of the interface
• Support new features in terms of old methods
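
A minimal sketch of the default-method trick, using a made-up `Shape` interface rather than the real collections classes. `Square` stands in for an old third-party implementation that was written before `describe()` existed and is never edited.

```java
// Hypothetical interface that has gained a method after implementations already exist.
interface Shape {
    double area(); // the original, "old" method

    // Added later: defined in terms of the old method, so no implementation breaks.
    default String describe() {
        return "shape with area " + area();
    }
}

// An "old" implementation: it only knows about area().
final class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}
```

`new Square(2).describe()` works even though `Square` never mentions `describe`, which is how Java8 added methods such as `List.sort` and `Iterable.forEach` to long-standing interfaces.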

Challenge: adding functional programming

1) Immutable “wrappers” around existing mutating classes
(They couldn’t do what we do in Comp215 and start over.)

2) Totally new “streams” which are a lot like our IList
But they can run in parallel (more on this later in the semester)
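
Both approaches can be sketched with standard library calls. The class `FunctionalDemo` and its method names are made up; `Collections.unmodifiableList` and `IntStream` are the real APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

final class FunctionalDemo {
    // 1) A read-only wrapper: the underlying list is still mutable, the view is not.
    static boolean wrapperRejectsWrites() {
        List<String> mutable = new ArrayList<>();
        mutable.add("a");
        List<String> readOnly = Collections.unmodifiableList(mutable);
        try {
            readOnly.add("b");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // 2) A stream pipeline that may run in parallel: sum of i*i for i in 1..n.
    static long sumOfSquares(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .mapToLong(i -> (long) i * i)
                .sum();
    }
}
```

Note that the wrapper only reports misuse at run time, whereas a list type with no mutating methods at all would rule it out at compile time.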

Meanwhile, other JVM languages

JRuby, Jython: Run Ruby or Python code in the JVM!

Groovy: dynamically typed (your “gradle” files are Groovy programs)

Scala: functional, object-oriented, and lots more (used in Comp311)

Clojure: a LISP-family language

Kotlin: like Scala, but simpler (used internally by IntelliJ)
Wallach’s temptation: teach Comp215 in Kotlin rather than Java8.

And lots of other random projects

Java Card: programming for tiny not-so-smartcards
Still used in some places

Java Micro Edition: flip phone programming
Apple iOS and Google Android won; Oracle (and Microsoft) lost

Java TV: back to Java’s set-top box roots!
Most “smart” TVs today are just a corner-to-corner web browser
Java is part of the Blu-ray standard (BD-J)

Twenty years later… How’s Java doing?

Most Java code written in 1995 will still compile and run just fine today
Really old C code doesn’t always work properly with modern C compilers.

Java has huge tooling and library support
And for CS education, high school and college, it’s still the standard.

Java gave up on running untrusted code in the browser
JavaScript now has that particular honor, and it’s got its own issues.
Java’s type safety makes it more resilient to security attacks than C or C++.

Are Java’s days numbered?
Twenty years of engineering decisions piled on top of one another.
But new JVM languages (like Kotlin) are very pleasant to use.