Scala 1


Prgramming Paradigms

Paradigm: In science, a paradigm describes distinct concepts or thought patterns in some scientific discipline.Main programming paradigms:

  • imperative programming
  • functional programming
  • logic programming

Orthogonal to Object-oriented programming

Imperative Programming

Imperative programming is about:

  • modifying mutable varaibles
  • using assignments
  • and control structure such as if-then-else, loops, break, continue, return.

The most common informal way to understand imperative programs is as instruction sequences for a Von Neumann computer:

Processer <---Bus---> Memory

There’s a strong correspondence between

mutable variables <-> memory cells
variable dereferences <-> load instruction
variable assignments <-> store instructions
control structures <-> jumps
  • Problem
    • 扩展性问题
    • Scaling up. How can we avoid conceptualizing programs word by word ?
  • Reference
    • John Backus, Can Programming Be Liberated from the von Neumann Style?
    • Turing Award Lecture 1978. 1950s 发明Fortran,20年后意识到Fortran的局限性。

In the end, pure imperative programming is limited by the “Von Neumann” bottleneck:

One tends to concepualize data structures word-by-word.

We need other techniques for defining high-level abstractions such as collections, polynomials, geometric shapes, strings, documents,… Ideally: Develop thories of collections, shapes, strings…

What is a Theory?

A theory consists of :

  • one or more data types 数据类型
  • operations on these types 数据操作
  • laws that describe the relationships between values and operations 它们之间的关系

Normally, a theory does not describe mutations!

Theories without Mutation

For instance the theory of polynomials defines the sum of two polynomials by laws such as:

( a * x +b ) + (c * x +b ) = (a+c)* x + (b+d)

But it does not define an operator to change a coefficient while keeping the polynomial the same

但是Theory中并没有关于mutation的定义,一个多项式的系数如果用代码表示:

class Polynomial { double[] coefficient; }
Ploynomial p = ...;
p.coefficient[0] = 42;

p的系数可以从外部被修改而变化,而p却不变,这显然违背了数学公式的唯一性:当系数确定时,多项式p是唯一的,当系数变化时,产生新的多项式p。因此,数学函数是没有mutation的,但是将这个模型映射到程序中,却很难做到。

Consequences for Programming

If we want to implement high-level concepts following their mathematical theories, there’s no place for mutation.

承接上面的问题,如果我们想把数学模型映射到程序中,是不允许mutation存在的。

  • The theories do not admit it.
  • Mutation can destroy useful laws in the theories

Therefore, let’s :

  1. concentrate on defining theories for operators expressed as functions
  2. avoid mutations
  3. have powerful ways to abstract and compose functions.

Functional Programming

  • In a restricted sense, FP means programming without mutable variables, assignments, loops, and other imperative control structures.没有变量,赋值,循环,过程性控制语句

  • In a wider sense, functional programming means focusing on the functions
  • In particular, functions can be values that are produced, consumed, and composed.functions in a FP language are first-class citizens.
    • they can be defined anywhere, including inside other functions
    • like any other value,they can be passed as parameters to functions and returned as results
    • as for other values, there exists a set of operators to compose functions.

Some FP lanugages

  • In the restriced sense:
    • Pure Lisp, XSLT, XPath, XQuery, FP
    • Haskell(without I/O, Monad, or UnsafePerformIO)
  • In the wider sense:
    • Lisp, Scheme, Racket, Clojure
    • SML, OCaml, F#
    • Haskell(full language)
    • Scala
    • Smalltalk Ruby(!) 支持block的OOP

History of FP lauguages

  • 1959 Lisp
  • 1975-77 ML,FP,Scheme
  • 1978 Smalltalk
  • 1986 SML
  • 1990 Haskell, Erlang
  • 1999 XSLT
  • 2000 OCaml
  • 2003 Scala, XQuery
  • 2005 F#
  • 2007 Clojure
  • SICP: A classic. Many parts of the course and quizzes are based on it, but we change the language from Scheme to Scala.

  • Programming in Scala: The standard language introduction and reference

Why Functional Prgramming?

  • simpler reasoning principles
  • better modularity
  • good for exploiting parallelism for multicore and cloud computing.

To find out more see the video of my 2011 Oscon Java keynote:Working Hard to Keep it Simple

  • Keynote of
    • 多核对并行计算有要求,但是现代的编程语言缺乏对这方面的支持
    • Parallel Programming vs Concurrent Programming
      • 并行编程主要利用CPU多核特性,使任务处理变的快速,但是任务处理可以使有序的
      • 并发编程指的是同一时刻需要处理许多任务
      • Both too hard to get right!
    • The Root of The Problem
      • 多线程引起的不确定性
      • 不确定性 = 平行计算 + mutable state
      • 如果要保证并行计算的确定性,首先要avoid mutable state
      • Avoiding mutable state 意味着 programming functionally
    • Scala:
      • Agile, with lightweight syntax
      • OO
      • Functional
      • Safe and performant, with strong static typing
    • Different Tools for Different Purposes
      • Parallelism:
        • Collections : Parallel Collections
        • Collections : Distributed Collections
        • Parallel DSLs
      • Concurrency:
        • Actors
        • Software transactional memory
        • Futures

Elements of Programming

Every non-trivial programming language provides:

  • primitive expressions representing the simplest elements
  • ways to combine expressions
  • ways to abstract expressions, which introduce a name for an expression by which it can then be referred to

REPL

  • 进入scala命令行:scala / sbt console
  • 退出scala::quit

Evaluation

A non-primitive expression is evaluated as follows/

  1. Take the leftmost operator
  2. Evaluate its operands (left before right)
  3. Apply the operator to the operands

A name is evaluated by replacing it with the right hand side of its definiation The evaluation process stops once it results in a value

scala> def square(x:Double) = x*x
square: (x: Double)Double

scala> def sumOfSquares(x:Double,y:Double)  = square(x)+square(y)
sumOfSquares: (x: Double, y: Double)Double

Parameter and Return Types

Function parameters come with their type, which is given after a colon

def power(x: Double, y:Int) :Double = 

If a return type is given, it follows the parameter list.Primitive types are as in Java, but are written capitalized

  1. Int : 32-bit integers
  2. Double : 64-bit floating pointer numbers
  3. Boolean boolean values true and false

Evaluation of Function Applications

  • Evaluate 函数参数,从左到右
  • 将函数按照等号右边展开
  • 将函数内部的参数替换成传入的参数
  • Example:

      sumOfSquare(3,2+2)
      sumOfSquare(3,4)
      square(3) + square(4)
      3*3 + square(4)
      9 + 4*4
      9 + 16
      25
    

The substitution model

  • This scheme of expression evaluation is called substitution model (这种规则被称为替换模型)

  • the idea underlying this model is that all evaluation does is reduce an expression to a value. (思想是所有对表达式的求值都会reduce出一个value,类似于代数求值,也可以理解为所有表达式都会最终有一个输出)

  • It can be applied to all expressions, as long as they have no side effects.(他可以被应用到所有的表达式中,但是表达式中不允许出现side-effect:不符合替换模型的表达式,如 a++;)

  • The substitution model is formalized in the lamda-calculus, which gives a foudation for function programming.(这种替换模型被标准化为lamda表达式,奠定了FP的基础。)

Termination

Does every expression reduce to a value (in a finite number of steps)?No. Here is a counter-example:

def loop: Int = loop
loop;

Chaning the evaluation strategy

The interpreter reduces function arguments to values before rewriting the function application.

One could alternatively apply the function to unreduced arguments.Example:

sumOfSquare(3,2+2)
square(3) + square(2+2)
3*3 + square(2+2)
9 + (2+2)(2+2)
9 + 4*4
25

Scala的解释器在处理函数的参数时,一种方式是将参数先进行求值后代入,另一种方式是lazy的形式,参数一直带入到最后再求值。

  1. 前一种称为 call-by-value , 它的优势是对入参表达式只进行一次求值
  2. 后一种称为 call-by-name , 它的优势是如果参数在函数中不会使用,那么就不会进行求值运算

Evaluation Strategies and Termination

Scala中通常使用CBV,但是如果参数用=>修饰则代表该参数用CBN求值,Example:

def constOne(x:Int, y: => Int) = 1

Conditionals and Value Definiation

Conditional Expressions

To express choosing between two alternatives, Scala使用条件表达式:

if - else

它和Java中的if-else很像,但是在imperative programming中,if-else称为statement,函数型语言中没有statement,只有expression,[而根据Dan的课程], Expression要具备三点:

  1. syntax
  2. type checking rules
  3. evaluation rules

具体可以参考ML中关于if-else的描述,Example:

def abs(x: Int) = if ( x >= 0 ) x else -x

Value Definitions

上面已经看到,函数的参数可以通过CBV和CBN两种方式求值,同样的规则也适用于定义函数。

  • def 是CBN,等号右边在函数被调用的时候再进行求值

  • val 是CBV,等号右边在函数定义的时候进行求值

val x = 2
val y = square(x)

Afterwards, the name refers to the value.

For instance, y above refers to 4, not square(2)

Value Definitions and Termination

valdef的区别当等号右边的表达式为死循环时,便可以体现的很清楚:

def loop : Boolean = loop
  • A definition:
def x = loop

没问题,等号右边不会求值。

  • Value:
val x = loop

死循环,等号右边定义的时候进行了求值。

对于Recursive的函数,函数的返回值需要显式声明,否则返回值是optional的。

Blocks and Lexical Scope

Nested functions

一般来说,一个任务可以分解为多个函数来实现,但有些函数只是helper,并只会被调用一次,它不需要暴露出来,这时候需要考虑使用block:

def cal(x:Double) = 
{
	def func1(x:Double):Double = x+1
	def func2(x:Double):Double = x*x
	func1(x) + func2(x)
}

Blocks in Scala

  • block用{…}表示
{
	val x = f(3)
	x*x
}
  • 它里面包含一系列definition 或者 expression
  • The last element of a block is an expression that defines its value(最后一个值为block的返回值)
  • This return expression can be preceded by auxiliary definitions
  • Blocks are themselves expressions; a block may appear everywhere an expression can. (意思是block是first class的)

Blocks and Visibility

val x = 0

def f(y:Int) = y+1

val result = {

	val x = f(3)
	x*x

}
  • The definitions inside a block are only visible from within the block.
  • the definitions inside a block shadow definitions of the same names outside the block

Lexical Scoping

Definitions of outer blocks are visible inside a block unless they are shadowed.

Therefore, we can simplify sqrt by eliminating redundant occurrences of the x parameter, which means everywhere the same thing.

Semicolons

Scala中分号是optional的,但是多个表达式在一行时,分号是不能省的:

val y = x+1; y*y

Summary

  1. arithmetic and boolean expressions
  2. conditional expressions if-else
  3. functions with recursion
  4. nesting and lexical scope
  5. CBN and CBV

Resource