Why should you learn Clojure?

What makes a good programming language? What qualities and characteristics should it have? It is hard to give an answer. Here is one possible definition: a good programming language should be good at solving the tasks assigned to it. After all, a programming language is just a tool in the hands of the programmer, and a tool is there to help us. In the end, that is the reason it was created. Different programming languages try to solve different problems (with varying success). The goal set in the design of Clojure is to make the programs we write simple and, as a consequence, to speed up their creation and testing. Most importantly, it aims to reduce the time needed to understand, change, and maintain them.


Clojure rocks?

A warning up front: this article will not contain code snippets meant to show off how cool Clojure is. There will be no phrases like "in language X it took 5 lines, and in Clojure just 4". That is a terrible criterion for the quality of a language! In the end, I couldn't care less whether I write qsort in 2 lines or have to stretch my fingers for all 5: in real life I will use a library function!

Lambdas are everywhere nowadays (well, almost, but usually by version 8 they appear everywhere). Collection processing (including parallel), list comprehensions, assorted syntactic sugar: many languages have all of this now. To be honest, I love such articles. But comparisons of this kind are absolutely unsuitable for judging the quality of languages! It is like measuring the speed of a programming language by how quickly a program prints "Hello, world!". Unless, of course, we are measuring the speed of HQ9+. If you think about it, such details are not that important for large systems. As a project grows, we care less and less whether we use parentheses or indentation, infix or prefix notation. An extra line when summing an array stops bothering anyone; problems of a different kind come to the fore.

The problem

The systems we create are volatile by nature. It would be very good if requirements did not change. It would be just great if, at the start of development, we could foresee every situation in advance. Alas, in real life we constantly have to finish, rework, improve, rewrite, replace, optimize... The most unpleasant part is that the complexity of the system grows over time. Constantly, continuously. Early in development everything is simple and transparent, any change is made quickly, and there are no "crutches". Beautiful. Over time, the picture stops being so rosy and cheerful. Even the slightest code change can potentially lead to a cascade of changes in the system's behavior. You have to study the code carefully and try to anticipate the side effects of each change. Eventually, we are physically unable to analyze all the possible consequences of our changes.

A person can, by nature, take in only a limited amount of information at any one moment. As the project grows, so does the number of internal connections. Moreover, most of these connections are implicit. It becomes harder and harder to keep everything we need in our heads. Meanwhile, the team changes: new people do not know the whole project. Areas of responsibility get divided up, which can lead to even greater confusion. Gradually, our system becomes complex.

How do we deal with this? Maximum regression test coverage, with the tests run after every change? Tests are extremely useful, but they are only a safety rope. If the tests fail, something has gone wrong and we already have a problem. Tests treat the symptoms; they do not solve the underlying problem. Strict guidelines and widespread use of patterns? No, the problem is not local difficulty. We simply stop understanding how the parts of our code interact; there are too many implicit connections. Constant refactoring? It is no panacea either, because complexity grows out of low-level decisions. In fact, the problem has to be addressed comprehensively, and one important ingredient is the right tool. A good programming language should help us write simple and transparent programs.

Simple and easy

But "simple" does not mean "easy". These are different concepts. Rich Hickey (the author of Clojure) even gave a well-known talk on the subject, Simple Made Easy. A translation of the slides was published on Habr. Simplicity is objective: it is the absence of complexity, of interleaving and tangling, a small number of connections. "Easy", on the other hand, is very subjective. Is it easy to ride a bike? To win a game of chess? To speak German? I don't know German, but that is no reason to say "this language is useless, it's too complicated". It is hard for me, and only because I never learned it.

We are all used to a function call being written as f(x, y). We are accustomed to programming within the framework of OOP. It is fairly standard. But easy is not necessarily simple. We just get used to the complexity of some things, start to ignore it, and take it for granted. An example of a function:

(defn select
  "Returns a set of the elements for which pred is true"
  {:added "1.0"}
  [pred xset]
  (reduce (fn [s k] (if (pred k) s (disj s k)))
          xset xset))

It looks very... strange! You have to spend some time learning the language and mastering its concepts before it becomes easy. But simplicity (or complexity) is constant. Even once we understand the tool well, the number of internal dependencies does not change. The code will not become simpler or more complex, though it will become easier for us.

A familiar tool may give the best results right now, in the short term, but in the long run it is the simplest solution that produces the best results.

Side effects

What are the sources of complexity in our programs? One of them is side effects. We cannot do without them entirely, but we can localize them. And the language should help us with that.

Clojure is a functional language, and it encourages us to write pure functions. The result of such a function depends only on its input parameters. There is no need to worry, "hmm, what if I have to call this other thing before calling this function?" No: there is an input and there is an output. No matter how many times we run the function, the result will be the same. This greatly simplifies testing. There is no need to try different call orders or to recreate (simulate) the right external state.

Pure functions are easier to analyze; you can literally "play" with them and watch how they behave on live data. The code is easier to debug. We can always reproduce a problem in a pure function: just pass it the input parameters that cause the error, because the result does not depend on what was done before. Pure functions are very simple, even when they do a lot of work.
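
As a small illustration of how a pure function can be exercised directly with literal data, here is a minimal sketch. The function order-total and the sample-items data are hypothetical and do not come from the original article.

```clojure
;; Hypothetical example: a pure function that computes an order total.
(defn order-total
  "Sums price * quantity over a collection of line items."
  [items]
  (reduce + 0 (map (fn [{:keys [price qty]}] (* price qty)) items)))

;; Illustrative input data (an assumption, not from the article).
(def sample-items [{:price 10 :qty 2} {:price 5 :qty 3}])

;; The same input always gives the same output, whatever was called before.
(println (order-total sample-items)) ; prints 35
(println (order-total []))           ; prints 0
```

No mocks or external state are needed to check it: the arguments are the whole context.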

Of course, Clojure supports higher-order functions and their composition.

((juxt dec inc) 1) ; => [0 2]
((comp str *) 1 2 3) ; => "6"
(map (partial * 10) [1 2 3]) ; => (10 20 30)
(map (comp inc inc) [1 2 3]) ; => (3 4 5)

Clojure is not a pure language, and functions can have side effects. For example, println is a function whose call performs an action. What matters is that the whole point of such functions is to interact with the outside world. Printing a value to a file, sending an HTTP request, executing SQL: all of these actions are meaningless apart from the side effects they create. That is why it is very useful to keep such functions (pure and dirty) separate.

But they (the dirty functions) hold no state. They only serve as a means of interacting with the outside world. As we shall see, Clojure isolates the state of our programs behind indirect references.

Immutability

All data structures in Clojure are immutable. There is no way to change an element of a vector; all we can do is create a new vector in which one element is different. A very important point is that Clojure preserves the algorithmic complexity (in time and memory) of all standard collection operations. Well, almost: instead of O(1), vectors give us O(lg₃₂(N)). In practice, even for collections of millions of items, lg₃₂(N) does not exceed 5.
This complexity is achieved through persistent collections. The idea is that when a structure is "changed", the old and new versions share most of their internal data. The old version remains fully usable. Moreover, we keep access to every version of the structure. This is an important point. Of course, versions that are no longer needed are reclaimed by the garbage collector.

(def a [1 2 3 4 5 6 7 8])
; a -> [1 2 3 4 5 6 7 8]
(def b (assoc a 3 999))
;b -> [1 2 3 999 5 6 7 8]
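
The same holds for large collections. The following sketch (the names big and big2 and the size are arbitrary assumptions chosen for illustration) shows that the original version stays intact and fully usable after an "update":

```clojure
;; A large vector; the size is arbitrary and chosen only for illustration.
(def big (vec (range 1000000)))

;; "Changing" one element builds a new vector that shares structure with big.
(def big2 (assoc big 500000 :changed))

(println (nth big 500000))          ; prints 500000
(println (nth big2 500000))         ; prints :changed
(println (count big) (count big2))  ; prints 1000000 1000000
```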

Out of the box, Clojure supports singly linked lists, vectors, hash maps, and red-black trees (sorted maps and sets). There is also a persistent queue implementation (for a stack you can use a list or a vector). All of them are immutable. To improve performance, you can define your own record types.

(defrecord Color [red green blue])
(def a (Color. 0.5 0.6 0.7))
; a => #user.Color{:red 0.5, :green 0.6, :blue 0.7}

Here we declare a structure with 3 fields. The Clojure compiler will create a class with 5 fields (2 of them "extra"): one field for metadata (null in our case), 3 fields for the actual data, and one more field for additional keys. Even if we declare a structure with an explicit list of fields for the sake of speed, Clojure still leaves us the option of adding extra values.

(defrecord Color [red green blue])
(def b (assoc a :alpha 0.1))
; b => #user.Color{:red 0.5, :green 0.6, :blue 0.7, :alpha 0.1}

And yes, Clojure has a special syntax for data structures:

; Vector
[1 2 3]
; Hash table
{:x 1, :y 2}
; Set
#{"a" "b" "c"}

State

So, we have pure functions, which define the business logic of our application. We have dirty functions for interacting with external systems (sockets, databases, a web server). And there is the internal state of our system, which Clojure stores in the form of indirect references.

There are 4 standard kinds of references:

var: an analogue of thread-local variables, used for contextual data such as the current database connection, the current HTTP request, precision settings for mathematical expressions, and the like;
atom: an atomic cell that lets you update state synchronously, but without coordination;
agent: a lightweight analogue of an actor (although in a sense they are opposites, see below), used for asynchronous state changes;
ref: a transactional memory cell that provides synchronous and coordinated work with state.

All global variables (including functions) are stored in vars. They can be overridden "locally".

(def ^:dynamic *a* 1)
(println *a*) ; => 1
(binding [*a* 42] (println *a*)) ; => 42

Here we tell the compiler that the variable *a* must be dynamic, i.e. its bindings are stored in a ThreadLocal. Using ThreadLocal reduces performance, so it is not the default for every var. But if need be, any var can be made dynamic after it has been created (which is often used in tests).

In tests you can replace entire functions.

; this function works with databases, sockets, etc.
; ^:dynamic is required so that binding can override it
(defn ^:dynamic some-function-with-side-effect [x] ...)

; and this is the function we want to test
(defn another-function [x] ...)

(deftest just-a-test
  ...
  (binding [some-function-with-side-effect (fn [x] ...)] ; install a mock function
    (another-function 123))
  ...)

All references in Clojure support the deref operation (get the value). For a var it looks like this:

; create a cell #'a
(def a 123)
(println a) ; => 123
(println #'a) ; => #'user/a
(println (deref #'a)) ; => 123

A cell stores a value (which is immutable), but at the same time the cell is an entity in its own right. The deref function has a special syntax, @ (yes, it is just sugar). Here is an example using an atom.

(let [x (atom 0)]
  (println @x) ; => 0
  (swap! x inc) ; CAS operation
  (println @x)) ; => 1


 
The function swap! takes an atom and a "mutating" function. The latter receives the current value of the atom and must return the new one. This is where persistent data structures come in handy. For example, we can store a vector of a million elements in an atom, and the "mutating" function will still run quickly enough for CAS (recall that operations on persistent collections have the same complexity as on ordinary mutable ones). Or we can update a few fields in a hash map:
 

 
(def account (atom {:login "theuser" :email "theuser@example.com"}))
(swap! account assoc :phone "12345")
; this code is equivalent to
(swap! account (fn [x] (assoc x :phone "12345")))

 

 
It is important that the function is pure, because it may be executed several times. We cannot (must not!) write something like:
 

 
(swap! x (fn [x] (insert-new-record-to-db x) (inc x)))
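
To see why the function may run more than once, here is a rough sketch that counts how many times the update function is actually invoked under contention. The names counter, attempts, and counting-inc, as well as the number of threads and iterations, are assumptions made for illustration; the call counter deliberately uses a Java AtomicLong as a visible side effect.

```clojure
(import 'java.util.concurrent.atomic.AtomicLong)

(def counter (atom 0))
(def ^AtomicLong attempts (AtomicLong. 0))

;; Impure on purpose: records every invocation, including retried ones.
(defn counting-inc [x]
  (.incrementAndGet attempts)
  (inc x))

;; 8 threads, 1000 swaps each (numbers chosen arbitrarily).
(let [workers (doall (for [_ (range 8)]
                       (future (dotimes [_ 1000]
                                 (swap! counter counting-inc)))))]
  (doseq [w workers] @w))

(println @counter)                  ; prints 8000: every update is applied exactly once
(println (>= (.get attempts) 8000)) ; prints true: retries can only add invocations

;; Only needed when running as a script, so the JVM can exit promptly.
(shutdown-agents)
```

The atom's value is always correct, but any side effect inside the function would be repeated on every retry.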

 

 
Agents

 
Agents are used to hold state that is directly tied to side effects. The idea is simple. We have a cell with a queue of functions "attached" to it. The functions are applied one after another to the value stored in the cell, and each result becomes the new value. Everything is computed asynchronously in a separate thread pool.
 

 
(def a (agent 0)) ; initial value

(send a inc)
(println @a) ; => 1 (once the action has run)

(send a (fn [x] (Thread/sleep 100) (inc x)))
(println @a) ; => 1

; after 100 ms
(println @a) ; => 2
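
Because send returns immediately, reading the agent right afterwards is a race. When a deterministic read is needed, the standard await function blocks until the actions dispatched so far from the current thread have run. A minimal sketch, with the agent name a chosen to mirror the listing above:

```clojure
(def a (agent 0))

(send a inc)
(send a (fn [x] (Thread/sleep 100) (inc x)))

;; Block until both actions dispatched above have been applied.
(await a)
(println @a) ; prints 2

;; Only needed when running as a script, so the JVM can exit promptly.
(shutdown-agents)
```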

 

 
Agents update their value asynchronously, but we can read the state of an agent at any moment. Agents can send messages to each other; a message sent from within an agent is held until the sending agent has updated its state. In other words, if an exception is thrown in an agent's action, the messages it sent are never delivered.
 

 
(def a (agent 0))
(def b (agent 0))

(send a (fn [x]
          (send b inc) ; send a message to b
          (throw (Exception. "Error"))))

(println @b)
; -> 0, the message never arrived

 

 
This suggests a certain analogy with the actor model. They are similar, but there are fundamental differences. An agent's state is explicit: at any time you can call deref and get the agent's value. This contradicts the idea of actors, where we can learn the state only indirectly, by sending and receiving messages. With actors, we cannot even be sure that querying the state does not "accidentally" change it. An agent is completely reliable in this sense: it can be changed only through the functions send and send-off (which differ only in the thread pool that processes the message).

The second key difference is that agents are open to change and to new functionality. The only way to change an actor's behavior is to rewrite its code. Agents are just references; they have no behavior of their own. We can write a new function and send it to the agent's queue.

Actors try to split our program into smaller parts that are easier to distribute or isolate. Updating and reading state both come down to sending messages. Sometimes this is very useful (for example, when running Erlang programs on several nodes). But often it is not required, and sometimes it even gets in the way. Agents, in turn, are convenient for storing large amounts of data that must be shared between threads: caches, sessions, intermediate results of mathematical calculations, and so on.

For an actor, we fix the set of messages it can respond to (all others are considered invalid). The order of messages also matters, since they can cause side effects. This is a public contract. For an agent, we fix only the data, that is, its structure. It is important to stress that agents are in no way trying to replace actors. These are different concepts with different areas of application.

As mentioned, agents work asynchronously. We can build chains of events (sending messages from agent to agent). But with agents alone we cannot change the state of our program in a coordinated way.
 

STM

 
Software transactional memory is one of the key features of Clojure. It is implemented on top of MVCC. Straight to an example:
 

 
(def account1 (ref 100))
(def account2 (ref 0))

(dosync
  (alter account1 - 30)
  (alter account2 + 30))

 

 
We decrease one value and increase the other at the same time. If something goes wrong (an exception is thrown), the entire transaction is cancelled:
 

 
(println @account1) ; => 70
(println @account2) ; => 30

(dosync
  (alter account1 * 0)
  (alter account2 / 0)) ; => ArithmeticException

; the values have not changed
(println @account1) ; => 70
(println @account2) ; => 30
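
The coordination guarantee can also be checked under concurrency. In the sketch below, the names acc-a, acc-b, and transfer!, together with the thread and iteration counts, are assumptions made for illustration. Several threads move money between two refs, and the final balances come out exact regardless of how many transactions had to be retried.

```clojure
(def acc-a (ref 1000))
(def acc-b (ref 0))

;; Hypothetical helper: both alters commit together or not at all.
(defn transfer! [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

;; 10 threads, 100 transfers of 1 unit each.
(let [workers (doall (repeatedly 10
                       #(future (dotimes [_ 100]
                                  (transfer! acc-a acc-b 1)))))]
  (doseq [w workers] @w))

(println @acc-a @acc-b) ; prints 0 1000

;; Only needed when running as a script, so the JVM can exit promptly.
(shutdown-agents)
```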

 

 
This is very similar to the usual ACID, but without Durability. When a transaction starts, all references are, as it were, frozen: their values stay fixed for the whole duration of the transaction. If, on reading or writing a reference, it turns out that its value has changed (another transaction has committed and spoiled things for us), the current transaction is restarted. That is why there should be no side effects (I/O, working with atoms) inside a transaction. And this is where agents come in handy.
 

 
(def a (ref 0))
(def b (ref 0))
(def out-agent (agent nil))

(dosync
  (println "transaction")
  (alter a inc) ; may cause the transaction to restart
  (let [a-value @a
        b-value @b]
    (send-off out-agent (fn [_] (println "a" a-value "b" b-value))))
  (alter b dec)) ; may also lead to a restart

 

 
All messages to agents are held until the moment the transaction completes. In our example, changing the references a and b may restart the transaction, so the word "transaction" can be printed several times. But the code inside the agent is executed exactly once, and only after the transaction has completed.
 

 
To make different transactions interfere with each other as little as possible, references in Clojure keep a history of values. By default only the latest value is kept, but when a conflict occurs (one transaction writes while another reads), the history size for that particular reference is increased (up to 10 values by default). Do not forget that references hold persistent structures, which share common structural elements. So keeping this history is very cheap in Clojure in terms of memory consumption.
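
The history limits can be inspected and tuned per reference with the standard functions ref-min-history, ref-max-history, and ref-history-count. A small sketch follows; the names r and tuned and the values 2 and 20 are arbitrary assumptions, while the defaults shown in the comments are those of the stock clojure.core implementation.

```clojure
(def r (ref 0))

(println (ref-min-history r)) ; prints 0, the default minimum
(println (ref-max-history r)) ; prints 10, the default maximum

;; Limits can be set when the ref is created.
(def tuned (ref 0 :min-history 2 :max-history 20))
(println (ref-min-history tuned) (ref-max-history tuned)) ; prints 2 20

;; Number of past values currently retained for this ref.
(println (ref-history-count tuned))
```

Raising the minimum history can help refs that are read by long transactions while being updated frequently.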
 

 
STM transactions do not get in our way when we change our code. There is no need to analyze whether a particular reference can be used in the current transaction. References are available to everyone, and we can add new ones completely transparently to existing code. References do not interact with each other. With conventional locks, by contrast, we must observe the lock/unlock order so as not to cause a deadlock.

Under concurrent access, reader transactions do not block each other, just as with a ReadWriteLock. Moreover, writer transactions do not block readers! Even if a transaction that changes a reference is currently running, we can obtain its value without blocking.

Agents and STM references complement each other. The former are not suitable for coordinated state changes; the latter do not let you work with side effects. Used together, they make our programs more transparent and simpler (less tangled) than "classic" tools (mutexes, semaphores, and the like).
 

 
Metaprogramming

 
Many languages now have metaprogramming facilities of one kind or another: AspectJ for Java, AST transformations for Groovy, decorators and metaclasses for Python, various kinds of reflection.

Clojure, being a Lisp, uses macros for this purpose. With them, we can program (extend) the language by means of the language itself. A macro is an "ordinary" function, with the only difference that it runs while the program is being compiled. The macro receives code that has not been compiled yet as its input, and the result of running it is new code, which the compiler then compiles.
 

 

(defmacro unless [pred a b]
  `(if (not ~pred)
     ~a
     ~b))

(unless (> 1 10)
  (println "1 < 10. ok")
  (println "1 > 10. wat"))

We have created our own control structure (an inverted version of if). All we had to do was write a function!
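
What the compiler actually receives can be inspected with the standard macroexpand-1 function. The sketch below redefines a three-argument unless macro so that it is self-contained; the keywords :yes and :no are placeholders chosen for illustration.

```clojure
(defmacro unless [pred a b]
  `(if (not ~pred)
     ~a
     ~b))

;; The macro receives unevaluated forms and returns a new form.
(println (macroexpand-1 '(unless (> 1 10) :yes :no)))
; prints (if (clojure.core/not (> 1 10)) :yes :no)

;; Only the selected branch is evaluated at run time.
(println (unless (> 1 10) :yes :no)) ; prints :yes
```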

Macros are used very widely in Clojure. By the way, many of the language's built-in operators are actually macros. For example, here is the implementation of or:

(defmacro or
  ([] nil)
  ([x] x)
  ([x & next]
    `(let [or# ~x]
       (if or# or# (or ~@next)))))

Even defn is just a macro that expands into def and fn. By the way, destructuring is also implemented with macros.

(let [[a b] [1 2]]
(+ a b))

; unfold in something like...

(let* [vec__123 [1 2]
a (clojure.core/nth vec__123 0 nil)
b (clojure.core/nth vec__123 1 nil)] 
(+ a b))

Not long ago, try-with-resources appeared in Java. We had to wait several years for Java 7 to get it. In Clojure, a few lines are enough:

(defmacro with-open [[v r] & body]
  `(let [~v ~r]
     (try
       ~@body
       (finally
         (.close ~v)))))

In other languages the situation is better, but still far from ideal. What matters is not the presence of a particular construct in the language, but the ability to add your own. It is therefore not surprising that, for example, pattern matching for Clojure is implemented as a separate library. There is simply no need to build such things into the core of the language; it is much better to implement them as macros. The situation is similar with support for monads, logic programming, advanced error handling, and other language extensions. There is even optional static typing!

Not to mention how convenient it is to create DSLs. A great many have been created for Clojure: for generating HTML, routing HTTP requests, working with relational databases and binary protocols, validating data... Creating them is simple and effective (although here, too, you need to know when to stop).

Clojure (like all Lisp-like languages) has a very important property: it is homoiconic. In other words, there is no need for a separate representation of the program's source code and no need for an extra level of abstraction in the form of an additional AST; the program already is a tree. Moreover, this tree is not built from special structures, just ordinary lists, vectors, and symbols. And we can work with our program exactly as with ordinary data.

(defn do2 [x]
(list 'do x x))

(do2 '(println 1))
; => (do (println 1) (println 1))
; equivalent to
; (list 'do (list 'println 1) (list 'println 1))
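
Since code is ordinary data, it can be taken apart, rebuilt, and then evaluated with eval. A minimal sketch; the names form and product-form are hypothetical.

```clojure
;; A piece of code represented as a plain list.
(def form '(+ 2 3 4))

(println (first form)) ; prints +
(println (eval form))  ; prints 9

;; Build a new program from the old one: swap the operator, keep the arguments.
(def product-form (cons '* (rest form)))

(println product-form)        ; prints (* 2 3 4)
(println (eval product-form)) ; prints 24
```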

For all their power, macros in Clojure do not hurt the readability of a program (as long as they are used in moderation, of course). A macro is just a function, and we can always determine unambiguously which function is used in the current context. For example, if we see the code (dosomething [a, b] c), it is easy to find out what is behind the name dosomething: just look at the beginning of the file (where other modules are imported). If it is a macro, its semantics are fixed and known. We do not need a sophisticated IDE to understand this code. Although, of course, an advanced development environment can expand the macro in place, letting us see what the compiler will turn our program into.

Polymorphism

Clojure has 2 mechanisms for creating polymorphic functions. Initially the language supported only multimethods, a powerful tool that is most often more than you need. Starting with version 1.2 (the current version at the time of writing is 1.5.1), the language gained a new concept: protocols.

Protocols are like Java interfaces, but they cannot inherit from one another. Each protocol describes a set of functions.

(defprotocol IShowable
(show [this]))
; ...
(map show [1 2 3])

This way we declare 2 entities: the protocol itself and the function show. It is an ordinary Clojure function that, when called, looks up the most appropriate implementation based on the type of its first argument. Separately, we declare the data structures we need and specify the protocol implementation for them.

(defrecord Color [red green blue]
  IShowable
  (show [this]
    ; record fields are directly in scope inside method bodies
    (str "<R" red "G" green "B" blue)))

You can implement a Protocol for third-party types (even inline).


(extend-protocol IShowable

  String
  (show [this] (str "string " this))

  clojure.lang.IPersistentVector
  (show [this] (str "vector " this))

  Object
  (show [this] "WAT"))

(show "123") ; => "string 123"
(show [1 2 3]) ; => "vector [1 2 3]"
(show '(1 2 3)) ; => "WAT"

You can add protocol implementations to already existing types, even if we have no access to their source code. There is no bytecode manipulation magic or similar trickery. Clojure maintains a global table of type -> implementing function; when a protocol function is called, the table is searched by the type of the first argument, taking the type hierarchy into account. Thus, declaring a new implementation of a protocol comes down to updating the global table.

But sometimes protocols are not enough, for example for double dispatch. In this (and other) cases we need multimethods. When we declare a multimethod, we supply a special function, the dispatcher. The dispatcher receives the same arguments as the multimethod. The final implementation is then looked up by the value the dispatcher returns. This may be a type, a keyword, or a vector. In the case of a vector, the most appropriate implementation is searched for across several values at once.

(defmulti convert
  (fn [obj target-type] [(class obj) target-type]))

(defmethod convert [String Integer] [x _] (Integer/parseInt x))
(defmethod convert [String Long] [x _] (Long/parseLong x))
(defmethod convert [Object String] [x _] (.toString x))
(defmethod convert [java.util.List String] [x _] (str "V" (vec x)))

(convert "123" Integer) ; -> 123
(convert "123" Long) ; -> 123
(convert 123 String) ; -> "123"
(convert [1 2 3] String) ; -> "V[1 2 3]"

Here we declared an abstract function whose implementation is selected based on the type of the first argument and the value of the second (which must be a class). Of course, Clojure takes the type hierarchy into account when looking for the right implementation. Types are convenient to dispatch on, but their hierarchy is rigidly fixed. We can, however, create our own ad-hoc hierarchy out of keywords.

; define the "child-parent" relationships
(derive ::rect ::shape)
(derive ::square ::rect)
(derive ::circle ::shape)
(derive ::triangle ::shape)

(defmulti perimeter :type)
; the code is shorter here because :type behaves like (fn [x] (:type x))

(defmethod perimeter ::rect [x] (* 2 (+ (:h x) (:w x))))
(defmethod perimeter ::triangle [x] (reduce + ((juxt :a :b :c) x)))
(defmethod perimeter ::circle [x] (* 2 Math/PI (:r x)))

(perimeter {:type ::rect, :h 10, :w 3}) ; -> 26
(perimeter {:type ::square, :h 10, :w 10}) ; -> 40
(perimeter {:type ::triangle, :a 3, :b 4, :c 5}) ; -> 12
(perimeter {:type ::shape}) ; -> throws an IllegalArgumentException

Several hierarchies can be declared. As with types, dispatch can be done on several values at once (a vector). When defining your hierarchy you can even mix keywords and Java types!
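
A separate hierarchy is an ordinary value created with make-hierarchy and passed to defmulti through the :hierarchy option. In the sketch below, the names shapes and describe and the keywords are assumptions made for illustration.

```clojure
;; An independent hierarchy; the global one is left untouched.
(def shapes
  (-> (make-hierarchy)
      (derive :square :rect)
      (derive :rect :shape)))

;; :hierarchy takes a reference (here a var) to the hierarchy to use.
(defmulti describe :type :hierarchy #'shapes)

(defmethod describe :rect [_] "a rectangle")
(defmethod describe :shape [_] "some shape")

(println (describe {:type :square}))   ; prints a rectangle
(println (isa? shapes :square :shape)) ; prints true
(println (isa? :square :shape))        ; prints false, the global hierarchy knows nothing about it
```

Passing the hierarchy as a var means that later additions to it are visible to the multimethod.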


(derive java.util.Map ::collection)
(derive java.util.Collection ::collection)
(derive ::tag java.lang.Iterable) ; --> ClassCastException

We can "inherit" a type from a keyword (but not vice versa). This is useful for creating groups of classes that are open to extension.

The multimethod system is simple but extremely powerful. For everyday needs the functionality of protocols is usually enough, and multimethods are a great way out in difficult and unusual situations.

Common sense

A language means nothing without infrastructure: without a community, a collection of libraries and frameworks, and all kinds of tools. One of Clojure's strengths is its use of the JVM. Integration with Java (in both directions) is extremely simple. It is no secret that there is a simply enormous number of libraries for Java (we will not discuss their quality here). All of them can be used directly from Clojure, although the number of native Clojure libraries is also quite large (and constantly growing).
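
As a rough sketch of what this integration looks like in practice, the snippet below uses only JDK classes (java.util.ArrayList and java.util.Collections); the name xs is an arbitrary choice.

```clojure
(import '(java.util ArrayList Collections))

;; A Clojure vector is a java.util.Collection, so Java constructors accept it.
(def xs (ArrayList. [3 1 2]))

(Collections/sort xs)
(println (vec xs)) ; prints [1 2 3]

;; A Clojure function implements java.util.Comparator.
(Collections/sort xs (fn [a b] (compare b a)))
(println (vec xs)) ; prints [3 2 1]

;; Calling an instance method on a Java object.
(println (.toUpperCase "clojure")) ; prints CLOJURE
```
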
Plug-ins for Eclipse and IDEA are under active development. For building projects, the leiningen utility has long been the de facto standard, used by the whole community. There are a variety of frameworks, both for creating web applications and for asynchronous servers.

There is also the Immutant application server (a wrapper around JBoss AS7). Immutant provides interfaces for working with Ring (the HTTP stack for Clojure), asynchronous messaging, distributed caching, scheduled jobs, distributed transactions, clustering, and more. Deploying and configuring Immutant is very simple.

Clojure has alternative implementations, such as a port to the .NET CLR. But the most noteworthy is ClojureScript, the port to JavaScript. Of course, it has no multithreading facilities and, as a consequence, no transactional memory or agents. But all the other language features are available, including persistent structures, macros, protocols, and multimethods. And the integration between ClojureScript and JavaScript is as good and simple as between Clojure and Java (and sometimes even better).

What next?

After that, it's simple. We have a tool. A working, reliable one. Not a silver bullet, but quite versatile. Simple. Yes, you may have to spend some time mastering it. Much of it may seem unfamiliar and strange. But that is only a matter of habit; you quickly come to see the beauty of the language in how organic it is, in the subtle joining of separate elements into a unified whole.

Getting to know Clojure is worth it. Definitely. Even if the tool does not suit you for one reason or another, the ideas it is built on will be very useful.

Article based on information from habrahabr.ru
