Scala 1
Prgramming Paradigms
Paradigm: In science, a paradigm describes distinct concepts or thought patterns in some scientific discipline.Main programming paradigms:
- imperative programming
- functional programming
- logic programming
Orthogonal to Object-oriented programming
Imperative Programming
Imperative programming is about:
- modifying mutable varaibles
- using assignments
- and control structure such as if-then-else, loops, break, continue, return.
The most common informal way to understand imperative programs is as instruction sequences for a Von Neumann computer:
Processer <---Bus---> Memory
There’s a strong correspondence between
mutable variables <-> memory cells
variable dereferences <-> load instruction
variable assignments <-> store instructions
control structures <-> jumps
- Problem
- 扩展性问题
- Scaling up. How can we avoid conceptualizing programs word by word ?
- Reference
- John Backus, Can Programming Be Liberated from the von Neumann Style?
- Turing Award Lecture 1978. 1950s 发明Fortran,20年后意识到Fortran的局限性。
In the end, pure imperative programming is limited by the “Von Neumann” bottleneck:
One tends to concepualize data structures word-by-word.
We need other techniques for defining high-level abstractions such as collections, polynomials, geometric shapes, strings, documents,… Ideally: Develop thories of collections, shapes, strings…
What is a Theory?
A theory consists of :
- one or more data types 数据类型
- operations on these types 数据操作
- laws that describe the relationships between values and operations 它们之间的关系
Normally, a theory does not describe mutations!
Theories without Mutation
For instance the theory of polynomials defines the sum of two polynomials by laws such as:
( a * x +b ) + (c * x +b ) = (a+c)* x + (b+d)
But it does not define an operator to change a coefficient while keeping the polynomial the same
但是Theory中并没有关于mutation的定义,一个多项式的系数如果用代码表示:
class Polynomial { double[] coefficient; }
Ploynomial p = ...;
p.coefficient[0] = 42;
p的系数可以从外部被修改而变化,而p却不变,这显然违背了数学公式的唯一性:当系数确定时,多项式p是唯一的,当系数变化时,产生新的多项式p。因此,数学函数是没有mutation的,但是将这个模型映射到程序中,却很难做到。
Consequences for Programming
If we want to implement high-level concepts following their mathematical theories, there’s no place for mutation.
承接上面的问题,如果我们想把数学模型映射到程序中,是不允许mutation存在的。
- The theories do not admit it.
- Mutation can destroy useful laws in the theories
Therefore, let’s :
- concentrate on defining theories for operators expressed as functions
- avoid mutations
- have powerful ways to abstract and compose functions.
Functional Programming
-
In a restricted sense, FP means programming without mutable variables, assignments, loops, and other imperative control structures.没有变量,赋值,循环,过程性控制语句
- In a wider sense, functional programming means focusing on the functions
- In particular, functions can be values that are produced, consumed, and composed.functions in a FP language are first-class citizens.
- they can be defined anywhere, including inside other functions
- like any other value,they can be passed as parameters to functions and returned as results
- as for other values, there exists a set of operators to compose functions.
Some FP lanugages
- In the restriced sense:
- Pure Lisp, XSLT, XPath, XQuery, FP
- Haskell(without I/O, Monad, or UnsafePerformIO)
- In the wider sense:
- Lisp, Scheme, Racket, Clojure
- SML, OCaml, F#
- Haskell(full language)
- Scala
- Smalltalk Ruby(!) 支持block的OOP
History of FP lauguages
- 1959 Lisp
- 1975-77 ML,FP,Scheme
- 1978 Smalltalk
- 1986 SML
- 1990 Haskell, Erlang
- 1999 XSLT
- 2000 OCaml
- 2003 Scala, XQuery
- 2005 F#
- 2007 Clojure
Recommended Book
-
SICP: A classic. Many parts of the course and quizzes are based on it, but we change the language from Scheme to Scala.
-
Programming in Scala: The standard language introduction and reference
Why Functional Prgramming?
- simpler reasoning principles
- better modularity
- good for exploiting parallelism for multicore and cloud computing.
To find out more see the video of my 2011 Oscon Java keynote:Working Hard to Keep it Simple
- Keynote of
- 多核对并行计算有要求,但是现代的编程语言缺乏对这方面的支持
- Parallel Programming vs Concurrent Programming
- 并行编程主要利用CPU多核特性,使任务处理变的快速,但是任务处理可以使有序的
- 并发编程指的是同一时刻需要处理许多任务
- Both too hard to get right!
- The Root of The Problem
- 多线程引起的不确定性
- 不确定性 = 平行计算 + mutable state
- 如果要保证并行计算的确定性,首先要avoid mutable state
- Avoiding mutable state 意味着 programming functionally
- Scala:
- Agile, with lightweight syntax
- OO
- Functional
- Safe and performant, with strong static typing
- Different Tools for Different Purposes
- Parallelism:
- Collections : Parallel Collections
- Collections : Distributed Collections
- Parallel DSLs
- Concurrency:
- Actors
- Software transactional memory
- Futures
- Parallelism:
Elements of Programming
Every non-trivial programming language provides:
- primitive expressions representing the simplest elements
- ways to combine expressions
- ways to abstract expressions, which introduce a name for an expression by which it can then be referred to
REPL
- 进入scala命令行:
scala
/sbt console
- 退出scala:
:quit
Evaluation
A non-primitive expression is evaluated as follows/
- Take the leftmost operator
- Evaluate its operands (left before right)
- Apply the operator to the operands
A name is evaluated by replacing it with the right hand side of its definiation The evaluation process stops once it results in a value
scala> def square(x:Double) = x*x
square: (x: Double)Double
scala> def sumOfSquares(x:Double,y:Double) = square(x)+square(y)
sumOfSquares: (x: Double, y: Double)Double
Parameter and Return Types
Function parameters come with their type, which is given after a colon
def power(x: Double, y:Int) :Double =
If a return type is given, it follows the parameter list.Primitive types are as in Java, but are written capitalized
Int
: 32-bit integersDouble
: 64-bit floating pointer numbersBoolean
boolean valuestrue
andfalse
Evaluation of Function Applications
- Evaluate 函数参数,从左到右
- 将函数按照等号右边展开
- 将函数内部的参数替换成传入的参数
-
Example:
sumOfSquare(3,2+2) sumOfSquare(3,4) square(3) + square(4) 3*3 + square(4) 9 + 4*4 9 + 16 25
The substitution model
-
This scheme of expression evaluation is called substitution model (这种规则被称为替换模型)
-
the idea underlying this model is that all evaluation does is reduce an expression to a value. (思想是所有对表达式的求值都会reduce出一个value,类似于代数求值,也可以理解为所有表达式都会最终有一个输出)
-
It can be applied to all expressions, as long as they have no side effects.(他可以被应用到所有的表达式中,但是表达式中不允许出现side-effect:不符合替换模型的表达式,如 a++;)
-
The substitution model is formalized in the lamda-calculus, which gives a foudation for function programming.(这种替换模型被标准化为lamda表达式,奠定了FP的基础。)
Termination
Does every expression reduce to a value (in a finite number of steps)?No. Here is a counter-example:
def loop: Int = loop
loop;
Chaning the evaluation strategy
The interpreter reduces function arguments to values before rewriting the function application.
One could alternatively apply the function to unreduced arguments.Example:
sumOfSquare(3,2+2)
square(3) + square(2+2)
3*3 + square(2+2)
9 + (2+2)(2+2)
9 + 4*4
25
Scala的解释器在处理函数的参数时,一种方式是将参数先进行求值后代入,另一种方式是lazy的形式,参数一直带入到最后再求值。
- 前一种称为 call-by-value , 它的优势是对入参表达式只进行一次求值
- 后一种称为 call-by-name , 它的优势是如果参数在函数中不会使用,那么就不会进行求值运算
Evaluation Strategies and Termination
Scala中通常使用CBV,但是如果参数用=>
修饰则代表该参数用CBN求值,Example:
def constOne(x:Int, y: => Int) = 1
Conditionals and Value Definiation
Conditional Expressions
To express choosing between two alternatives, Scala使用条件表达式:
if - else
它和Java中的if-else
很像,但是在imperative programming中,if-else
称为statement,函数型语言中没有statement,只有expression,[而根据Dan的课程], Expression要具备三点:
- syntax
- type checking rules
- evaluation rules
具体可以参考ML中关于if-else
的描述,Example:
def abs(x: Int) = if ( x >= 0 ) x else -x
Value Definitions
上面已经看到,函数的参数可以通过CBV和CBN两种方式求值,同样的规则也适用于定义函数。
-
def
是CBN,等号右边在函数被调用的时候再进行求值 -
val
是CBV,等号右边在函数定义的时候进行求值
val x = 2
val y = square(x)
Afterwards, the name refers to the value.
For instance, y above refers to 4, not square(2)
Value Definitions and Termination
val
和def
的区别当等号右边的表达式为死循环时,便可以体现的很清楚:
def loop : Boolean = loop
- A definition:
def x = loop
没问题,等号右边不会求值。
- Value:
val x = loop
死循环,等号右边定义的时候进行了求值。
对于Recursive的函数,函数的返回值需要显式声明,否则返回值是optional的。
Blocks and Lexical Scope
Nested functions
一般来说,一个任务可以分解为多个函数来实现,但有些函数只是helper,并只会被调用一次,它不需要暴露出来,这时候需要考虑使用block:
def cal(x:Double) =
{
def func1(x:Double):Double = x+1
def func2(x:Double):Double = x*x
func1(x) + func2(x)
}
Blocks in Scala
- block用{…}表示
{
val x = f(3)
x*x
}
- 它里面包含一系列definition 或者 expression
- The last element of a block is an expression that defines its value(最后一个值为block的返回值)
- This return expression can be preceded by auxiliary definitions
- Blocks are themselves expressions; a block may appear everywhere an expression can. (意思是block是first class的)
Blocks and Visibility
val x = 0
def f(y:Int) = y+1
val result = {
val x = f(3)
x*x
}
- The definitions inside a block are only visible from within the block.
- the definitions inside a block shadow definitions of the same names outside the block
Lexical Scoping
Definitions of outer blocks are visible inside a block unless they are shadowed.
Therefore, we can simplify sqrt
by eliminating redundant occurrences of the x parameter, which means everywhere the same thing.
Semicolons
Scala中分号是optional的,但是多个表达式在一行时,分号是不能省的:
val y = x+1; y*y
Summary
- arithmetic and boolean expressions
- conditional expressions if-else
- functions with recursion
- nesting and lexical scope
- CBN and CBV