Scala 1

Posted on 2013-12-03

Prgramming Paradigms

Paradigm: In science, a paradigm describes distinct concepts or thought patterns in some scientific discipline.Main programming paradigms:

imperative programming
functional programming
logic programming

Orthogonal to Object-oriented programming

Imperative Programming

Imperative programming is about:

modifying mutable varaibles
using assignments
and control structure such as if-then-else, loops, break, continue, return.

The most common informal way to understand imperative programs is as instruction sequences for a Von Neumann computer：

Processer <---Bus---> Memory

There’s a strong correspondence between

mutable variables <-> memory cells
variable dereferences <-> load instruction
variable assignments <-> store instructions
control structures <-> jumps

Problem
- 扩展性问题
- Scaling up. How can we avoid conceptualizing programs word by word ?
Reference
- John Backus, Can Programming Be Liberated from the von Neumann Style?
- Turing Award Lecture 1978. 1950s 发明Fortran，20年后意识到Fortran的局限性。

In the end, pure imperative programming is limited by the “Von Neumann” bottleneck:

One tends to concepualize data structures word-by-word.

We need other techniques for defining high-level abstractions such as collections, polynomials, geometric shapes, strings, documents,… Ideally: Develop thories of collections, shapes, strings…

What is a Theory?

A theory consists of :

one or more data types 数据类型
operations on these types 数据操作
laws that describe the relationships between values and operations 它们之间的关系

Normally, a theory does not describe mutations!

Theories without Mutation

For instance the theory of polynomials defines the sum of two polynomials by laws such as:

( a * x +b ) + (c * x +b ) = (a+c)* x + (b+d)

But it does not define an operator to change a coefficient while keeping the polynomial the same

但是Theory中并没有关于mutation的定义，一个多项式的系数如果用代码表示:

class Polynomial { double[] coefficient; }
Ploynomial p = ...;
p.coefficient[0] = 42;

p的系数可以从外部被修改而变化，而p却不变，这显然违背了数学公式的唯一性：当系数确定时，多项式p是唯一的，当系数变化时，产生新的多项式p。因此，数学函数是没有mutation的，但是将这个模型映射到程序中，却很难做到。

Consequences for Programming

If we want to implement high-level concepts following their mathematical theories, there’s no place for mutation.

承接上面的问题，如果我们想把数学模型映射到程序中，是不允许mutation存在的。

The theories do not admit it.
Mutation can destroy useful laws in the theories

Therefore, let’s :

concentrate on defining theories for operators expressed as functions
avoid mutations
have powerful ways to abstract and compose functions.

Functional Programming

In a restricted sense, FP means programming without mutable variables, assignments, loops, and other imperative control structures.没有变量，赋值，循环，过程性控制语句
In a wider sense, functional programming means focusing on the functions
In particular, functions can be values that are produced, consumed, and composed.functions in a FP language are first-class citizens.
- they can be defined anywhere, including inside other functions
- like any other value,they can be passed as parameters to functions and returned as results
- as for other values, there exists a set of operators to compose functions.

Some FP lanugages

In the restriced sense:
- Pure Lisp, XSLT, XPath, XQuery, FP
- Haskell(without I/O, Monad, or UnsafePerformIO)
In the wider sense:
- Lisp, Scheme, Racket, Clojure
- SML, OCaml, F#
- Haskell(full language)
- Scala
- Smalltalk Ruby(!) 支持block的OOP

History of FP lauguages

1959 Lisp
1975-77 ML,FP,Scheme
1978 Smalltalk
1986 SML
1990 Haskell, Erlang
1999 XSLT
2000 OCaml
2003 Scala, XQuery
2005 F#
2007 Clojure

Recommended Book

SICP: A classic. Many parts of the course and quizzes are based on it, but we change the language from Scheme to Scala.
Programming in Scala: The standard language introduction and reference

Why Functional Prgramming?

simpler reasoning principles
better modularity
good for exploiting parallelism for multicore and cloud computing.

To find out more see the video of my 2011 Oscon Java keynote:Working Hard to Keep it Simple

Keynote of
- 多核对并行计算有要求，但是现代的编程语言缺乏对这方面的支持
- Parallel Programming vs Concurrent Programming
  - 并行编程主要利用CPU多核特性，使任务处理变的快速，但是任务处理可以使有序的
  - 并发编程指的是同一时刻需要处理许多任务
  - Both too hard to get right!
- The Root of The Problem
  - 多线程引起的不确定性
  - 不确定性 = 平行计算 + mutable state
  - 如果要保证并行计算的确定性，首先要avoid mutable state
  - Avoiding mutable state 意味着 programming functionally
- Scala:
  - Agile, with lightweight syntax
  - OO
  - Functional
  - Safe and performant, with strong static typing
- Different Tools for Different Purposes
  - Parallelism:
    - Collections : Parallel Collections
    - Collections : Distributed Collections
    - Parallel DSLs
  - Concurrency:
    - Actors
    - Software transactional memory
    - Futures

Elements of Programming

Every non-trivial programming language provides:

primitive expressions representing the simplest elements
ways to combine expressions
ways to abstract expressions, which introduce a name for an expression by which it can then be referred to

REPL

进入scala命令行:scala / sbt console
退出scala::quit

Evaluation

A non-primitive expression is evaluated as follows/

Take the leftmost operator
Evaluate its operands (left before right)
Apply the operator to the operands

A name is evaluated by replacing it with the right hand side of its definiation The evaluation process stops once it results in a value

scala> def square(x:Double) = x*x
square: (x: Double)Double

scala> def sumOfSquares(x:Double,y:Double)  = square(x)+square(y)
sumOfSquares: (x: Double, y: Double)Double

Parameter and Return Types

Function parameters come with their type, which is given after a colon

def power(x: Double, y:Int) :Double =

If a return type is given, it follows the parameter list.Primitive types are as in Java, but are written capitalized

Int : 32-bit integers
Double : 64-bit floating pointer numbers
Boolean boolean values true and false

Evaluation of Function Applications

Evaluate 函数参数，从左到右
将函数按照等号右边展开
将函数内部的参数替换成传入的参数

Example:

  sumOfSquare(3,2+2)
  sumOfSquare(3,4)
  square(3) + square(4)
  3*3 + square(4)
  9 + 4*4
  9 + 16
  25

The substitution model

This scheme of expression evaluation is called substitution model （这种规则被称为替换模型）
the idea underlying this model is that all evaluation does is reduce an expression to a value. （思想是所有对表达式的求值都会reduce出一个value，类似于代数求值，也可以理解为所有表达式都会最终有一个输出）
It can be applied to all expressions, as long as they have no side effects.（他可以被应用到所有的表达式中，但是表达式中不允许出现side-effect：不符合替换模型的表达式，如 a++;）
The substitution model is formalized in the lamda-calculus, which gives a foudation for function programming.（这种替换模型被标准化为lamda表达式，奠定了FP的基础。）

Termination

Does every expression reduce to a value (in a finite number of steps)?No. Here is a counter-example:

def loop: Int = loop
loop;

Chaning the evaluation strategy

The interpreter reduces function arguments to values before rewriting the function application.

One could alternatively apply the function to unreduced arguments.Example:

sumOfSquare(3,2+2)
square(3) + square(2+2)
3*3 + square(2+2)
9 + (2+2)(2+2)
9 + 4*4
25

Scala的解释器在处理函数的参数时，一种方式是将参数先进行求值后代入，另一种方式是lazy的形式，参数一直带入到最后再求值。

前一种称为 call-by-value ，它的优势是对入参表达式只进行一次求值
后一种称为 call-by-name ，它的优势是如果参数在函数中不会使用，那么就不会进行求值运算

Evaluation Strategies and Termination

Scala中通常使用CBV，但是如果参数用=>修饰则代表该参数用CBN求值,Example:

def constOne(x:Int, y: => Int) = 1

Conditionals and Value Definiation

Conditional Expressions

To express choosing between two alternatives, Scala使用条件表达式：

if - else

它和Java中的if-else很像，但是在imperative programming中，if-else称为statement，函数型语言中没有statement，只有expression，[而根据Dan的课程], Expression要具备三点：

syntax
type checking rules
evaluation rules

具体可以参考ML中关于if-else的描述，Example:

def abs(x: Int) = if ( x >= 0 ) x else -x

Value Definitions

上面已经看到，函数的参数可以通过CBV和CBN两种方式求值，同样的规则也适用于定义函数。

def 是CBN，等号右边在函数被调用的时候再进行求值
val 是CBV，等号右边在函数定义的时候进行求值

val x = 2
val y = square(x)

Afterwards, the name refers to the value.

For instance, y above refers to 4, not square(2)

Value Definitions and Termination

val和def的区别当等号右边的表达式为死循环时，便可以体现的很清楚：

def loop : Boolean = loop

A definition:

def x = loop

没问题，等号右边不会求值。

Value:

val x = loop

死循环，等号右边定义的时候进行了求值。

对于Recursive的函数，函数的返回值需要显式声明，否则返回值是optional的。

Blocks and Lexical Scope

Nested functions

一般来说，一个任务可以分解为多个函数来实现，但有些函数只是helper，并只会被调用一次，它不需要暴露出来，这时候需要考虑使用block：

def cal(x:Double) = 
{
	def func1(x:Double):Double = x+1
	def func2(x:Double):Double = x*x
	func1(x) + func2(x)
}

Blocks in Scala

block用{…}表示

{
	val x = f(3)
	x*x
}

它里面包含一系列definition 或者 expression
The last element of a block is an expression that defines its value(最后一个值为block的返回值)
This return expression can be preceded by auxiliary definitions
Blocks are themselves expressions; a block may appear everywhere an expression can. (意思是block是first class的)

Blocks and Visibility

val x = 0

def f(y:Int) = y+1

val result = {

	val x = f(3)
	x*x

}

The definitions inside a block are only visible from within the block.
the definitions inside a block shadow definitions of the same names outside the block

Lexical Scoping

Definitions of outer blocks are visible inside a block unless they are shadowed.

Therefore, we can simplify sqrt by eliminating redundant occurrences of the x parameter, which means everywhere the same thing.

Semicolons

Scala中分号是optional的，但是多个表达式在一行时，分号是不能省的:

val y = x+1; y*y

Summary

arithmetic and boolean expressions
conditional expressions if-else
functions with recursion
nesting and lexical scope
CBN and CBV

Resource

Programming in Scala