What Is Functional Programming? (Part 1: Side-Effects)

Humor 2016. 1. 17. 16:17

What Is Functional Programming?

29 Dec 2015

This is my take on what functional programming really is, in a way that will make sense to a jobbing programmer just trying to Get Stuff Done.

I put it to you that every function you write has two sets of inputs and two sets of outputs.

Two? Only one, surely?

No, two. Definitely two. Let's take a look at the first pair with this example:

public int square(int x) {
    return x * x;
}
// NOTE: The language doesn't matter, but I've picked one with
// explicit input & output types, for emphasis.

Here, the input you're used to thinking about is int x, and the output you're used to is also an int.

That's the first set of inputs & outputs. The traditional set, if you will. Now let's see an example of the second set of inputs and outputs:

public void processNext() {
    Message message = InboxQueue.popMessage();

    if (message != null) {
        process(message);
    }
}

According to the syntax, this function takes no inputs and returns no output, and yet it's obviously depending on something, and it's obviously doing something. The fact is, it has a hidden set of inputs and outputs. The hidden input is the state of the InboxQueue before the popMessage() call, and the hidden outputs are whatever process causes, plus the state of InboxQueue after we're done.

Make no mistake - the state of InboxQueue is a genuine input of this function. The behaviour of processNext cannot be known without knowing that value. And it's a genuine output too - the result of calling processNext cannot be fully understood without considering the new state of InboxQueue.

So the second piece of code has hidden inputs and outputs. It requires things, and causes things, but you could never guess what just by looking at the API.
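To make that concrete, here is a minimal runnable sketch. Everything in it is a hypothetical stand-in: `HiddenInputDemo`, its static `inbox` and the `processed` list only imitate InboxQueue and process, whose real definitions aren't shown. The point is that the same call behaves differently depending on state the signature never mentions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for the InboxQueue example; all names are illustrative.
final class HiddenInputDemo {
    // Hidden input: global mutable state that processNext() reads.
    static final Deque<String> inbox = new ArrayDeque<>();
    // Hidden output: a record of what process() did.
    static final List<String> processed = new ArrayList<>();

    static void process(String message) {
        processed.add(message.toUpperCase());
    }

    // No arguments, no return value, yet it depends on and changes the world.
    static void processNext() {
        String message = inbox.poll(); // null when the queue is empty
        if (message != null) {
            process(message);
        }
    }

    public static void main(String[] args) {
        processNext();                       // empty inbox: nothing happens
        System.out.println(processed);       // []
        inbox.add("hello");
        processNext();                       // same call, different behaviour
        System.out.println(processed);       // [HELLO]
        System.out.println(inbox.isEmpty()); // true
    }
}
```

Nothing in `processNext()`'s signature hints at either `inbox` or `processed`, which is exactly the problem.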

These hidden inputs and outputs have an official name: "side-effects". There are many kinds of side-effects, but they all come together under the same concept: "when we call this function, what does it need that isn't in the argument list, and what does it do that isn't part of the return value?"

(Actually I think we need two terms: "side-effects" for the hidden outputs, and "side-causes" for the hidden inputs. For most of the rest of this post I'll use "side-effects" for brevity, but I'm definitely talking about side-causes too. I'm talking about all hidden inputs and outputs.)

Side-Effects are the Complexity Iceberg

When functions have side-effects (and side-causes), you can look at a function like this:

public boolean processMessage(Channel channel) {...}

…and think you've got an idea of what it's doing, and be totally wrong. There's no way to know what it requires or what it will do without looking inside. Does it take a message off the channel and process it? Probably. Does it close your channel if some condition is true? Maybe. Does it update a count in the database somewhere? Perhaps. Does it explode if it can't find the logging directory path it was expecting? It might.

Side-effects are the complexity iceberg. You look at the function signature, and the name, and think you've got a sense of what you're looking at. But hidden beneath the surface of the function signature could be absolutely anything. Any hidden requirement, any hidden change. Without looking at the implementation, you've no way of knowing what's really involved. Beneath the surface of the API is a potentially vast block of extra complexity. To grasp it, you'll only have three options: dive down into the function definition, bring the complexity to the surface, or ignore it and hope for the best. And in the end, ignoring it is usually a titanic mistake.

Isn't This What Encapsulation's About?

No.

Encapsulation is about hiding implementation details. About hiding the innards of the code so the caller doesn't need to worry about them. That remains a good principle, but it's not what we're talking about.

Side-effects aren't about "hiding implementation details" - they're about hiding the code's relationship with the outside world. A function with side-causes has undocumented assumptions about what external factors it's depending on. A function with side-effects has undocumented assumptions about what external factors it's going to change.

Are Side-Effects Bad?

When they work exactly as the original programmer expected, no, they're probably fine. But there's the rub: we have to trust that the hidden expectations of the original programmer were correct, and will remain correct as time marches on.

Have we set up the state of the world the way this function expected when it was written? Or did the world get changed somewhere? Perhaps because a seemingly-unconnected piece of code changed. Or because we're installing the software in a new environment. Hidden assumptions about the state of the world mean we have hidden hopes that it's similar enough to work.

Can we test this code? Not in isolation. Unlike a circuit board, we can't just plug into its inputs and check its outputs. We have to break open the code, figure out its hidden causes and effects, and simulate the world it's supposed to exist in. I've seen several TDD'ers spin in circles about whether they should do black box or white box testing. The answer is, you ought to do black box testing - you ought to be able to ignore the implementation details - but if you allow side-effects, you can't. Side-effects close the door to black box testing, because you can't get to the inputs & outputs without cracking the box open and learning what's inside.

This effect is amplified for debugging. If a function doesn't allow side-effects (or side-causes), you can understand whether it's correct just by giving it some inputs and checking the outputs. But a function with side-effects? There's no upper limit to how many other parts of the system you'll have to consider. When it's allowed to depend on anything, and cause anything, then the bugs could be anywhere.

We Can Always Surface Side-Effects

Can we do anything about this complexity? Yes. It's actually pretty simple to get started: If a function has something as an input, just say so. If it returns something as an output, declare it. Simple as that.

Let's try an example. Here's a function with a hidden input. Bonus points if you spot it quickly:

public Program getCurrentProgram(TVGuide guide, int channel) {
    Schedule schedule = guide.getSchedule(channel);

    Program current = schedule.programAt(new Date());

    return current;
}

This function has a hidden input of the current time (new Date()). We can surface this complexity by just being honest about this extra input:

public Program getProgramAt(TVGuide guide, int channel, Date when) {
    Schedule schedule = guide.getSchedule(channel);

    Program program = schedule.programAt(when);

    return program;
}

This function now has no hidden inputs (or outputs).

Let's look at the pros and cons of this new version:

Cons

It looks more complex. It has three arguments instead of two.

Pros

It isn't more complex. Hiding a dependency didn't make it simpler; being honest about it doesn't make it more complex.

It's vastly easier to test. Testing different times of day, clock changes, leap years, will all be straightforward, because we can pass in any time we like. I've seen code like the first version in production, with all sorts of clever tricks to spoof the current system clock for testing's sake. Imagine the effort, when we can just make it a parameter!

It's easier to reason about: This function now just describes a relationship between its inputs and its outputs. If you know the inputs, you know what the result should be, and you know everything about the result. This is a big deal. We can verify this code in isolation. As long as we've tested the relationship between inputs and outputs, we've tested the whole of the function.

(And as an aside, it's also more useful. We get, "what program starts in an hour?" code for free.)
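As a rough illustration of the testing point, the sketch below fills in just enough of the example to run. The `Program`, `Schedule` and `TVGuide` classes are hypothetical minimal stand-ins (the snippets above never define them), `ProgramLookup` and `sampleGuide()` are names invented here, and the schedule data is made up.

```java
import java.util.Date;
import java.util.List;
import java.util.Map;

// Hypothetical minimal types; the post does not define them.
final class Program {
    final String title;
    final Date start;
    final Date end;

    Program(String title, Date start, Date end) {
        this.title = title;
        this.start = start;
        this.end = end;
    }
}

final class Schedule {
    private final List<Program> programs;

    Schedule(List<Program> programs) {
        this.programs = programs;
    }

    // Returns the program airing at the given instant, or null if there is none.
    Program programAt(Date when) {
        for (Program p : programs) {
            if (!when.before(p.start) && when.before(p.end)) {
                return p;
            }
        }
        return null;
    }
}

final class TVGuide {
    private final Map<Integer, Schedule> schedules;

    TVGuide(Map<Integer, Schedule> schedules) {
        this.schedules = schedules;
    }

    Schedule getSchedule(int channel) {
        return schedules.get(channel);
    }
}

final class ProgramLookup {
    static final long HOUR = 60 * 60 * 1000L;

    // The time is an ordinary argument, so nothing here reads the system clock.
    static Program getProgramAt(TVGuide guide, int channel, Date when) {
        return guide.getSchedule(channel).programAt(when);
    }

    // Made-up schedule: News for the first hour after the epoch, then a two-hour film.
    static TVGuide sampleGuide() {
        Program news = new Program("News", new Date(0L), new Date(HOUR));
        Program film = new Program("Film", new Date(HOUR), new Date(3 * HOUR));
        return new TVGuide(Map.of(1, new Schedule(List.of(news, film))));
    }

    public static void main(String[] args) {
        TVGuide guide = sampleGuide();
        Date now = new Date(HOUR / 2); // a fixed "now"; no clock spoofing needed
        System.out.println(getProgramAt(guide, 1, now).title); // News
        Date inAnHour = new Date(now.getTime() + HOUR);
        System.out.println(getProgramAt(guide, 1, inAnHour).title); // Film
    }
}
```

Asking what's on in an hour is the same function with a different argument, and a test can pin "now" to any instant it likes.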

What is a 'Pure Function'?

Drumroll please.

Now, finally, with an awareness of hidden inputs & outputs, we can give "a jobbing programmer's definition of pure functions":

A function is called 'pure' if all its inputs are declared as inputs - none of them are hidden - and likewise all its outputs are declared as outputs.

In contrast, if it has hidden inputs or outputs, it's 'impure', and the contract we think the function offers is only half the story. The iceberg of complexity looms. We can never use impure code "in isolation". We can never test it in isolation. It always depends on other things which we have to keep track of whenever we want to test or debug.
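A small side-by-side sketch of that definition follows. The `PurityDemo` class and both methods are made up for illustration; they simply contrast a function with a hidden input and output against one whose signature tells the whole story.

```java
// Illustrative only: a contrast between an impure and a pure running total.
final class PurityDemo {
    // Hidden state: read and written by impureTotal, absent from its signature.
    static int runningTotal = 0;

    // Impure: the result depends on runningTotal (a side-cause)
    // and the call changes runningTotal (a side-effect).
    static int impureTotal(int amount) {
        runningTotal += amount;
        return runningTotal;
    }

    // Pure: every input is a parameter, every output is the return value.
    static int pureTotal(int previousTotal, int amount) {
        return previousTotal + amount;
    }

    public static void main(String[] args) {
        System.out.println(impureTotal(5));  // 5
        System.out.println(impureTotal(5));  // 10: same argument, different answer
        System.out.println(pureTotal(0, 5)); // 5
        System.out.println(pureTotal(0, 5)); // 5: same arguments, same answer
    }
}
```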

What is 'Functional Programming'?

With an awareness of pure and impure functions, we can now give a "jobbing programmer's definition of functional programming":

Functional programming is about writing pure functions, about removing hidden inputs and outputs as far as we can, so that as much of our code as possible just describes a relationship between inputs and outputs.

We accept that some side-effects are inevitable: most programs are run for what they do rather than what they return. But within our program we will exercise tight control. We will eliminate side-effects (and side-causes) wherever we can, and tightly control them whenever we can't.

Or put another way: Let's not hide what a piece of code needs, nor what results it will yield. If a piece of code needs something to run correctly, let it say so. If it does something useful, let it declare it as an output. When we do this, our code will be clearer. Complexity will come to the surface, where we can break it down and deal with it.
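Here is one way the earlier processNext example might look with both its hidden input and its hidden outputs declared. This is only a sketch under my own assumptions: the `Result` type, the string messages and the `process` stand-in are invented for illustration, and a real message handler would likely still need a thin impure layer around it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: processNext with its inputs and outputs surfaced.
final class SurfacedProcessNext {
    // Declared output: what was produced, plus the queue that remains.
    static final class Result {
        final List<String> remaining;
        final List<String> produced;

        Result(List<String> remaining, List<String> produced) {
            this.remaining = remaining;
            this.produced = produced;
        }
    }

    // Hypothetical stand-in for process(message): a pure transformation.
    static String process(String message) {
        return "processed:" + message;
    }

    // The queue is now an argument; the new queue and the results are returned.
    // The caller's list is never modified.
    static Result processNext(List<String> inbox) {
        if (inbox.isEmpty()) {
            return new Result(inbox, List.of());
        }
        List<String> remaining = new ArrayList<>(inbox.subList(1, inbox.size()));
        return new Result(remaining, List.of(process(inbox.get(0))));
    }

    public static void main(String[] args) {
        Result result = processNext(List.of("a", "b"));
        System.out.println(result.produced);  // [processed:a]
        System.out.println(result.remaining); // [b]
    }
}
```

The unavoidable effect, the printing, now lives only in `main`; everything else can be checked by passing a list in and inspecting what comes back.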

What is a 'Functional Programming Language'?

Every language supports pure functions - it's hard to make add(x, y) impure [1]. And in many cases converting an impure function to a pure one is just a case of lifting all its inputs and outputs into the function signature, so that the signature totally describes its behaviour. So are all programming languages 'functional'?

No. Because then the term would be meaningless.

So what can we give as a "jobbing programmer's definition of a functional programming language"?

A functional programming language is one that supports and encourages programming without side-effects.

Or more specifically: A functional language actively helps you eliminate side-effects wherever possible, and tightly control them wherever it's not.

Or more dramatically: A functional language is actively hostile to side-effects. Side-effects are complexity and complexity is bugs and bugs are the devil. A functional language will help you be hostile to side-effects too. Together you will beat them into submission.

Is That It?

Yes. There are a couple of subtleties - things you probably never thought of as a hidden input before - but that's the essence. But start building software with the perspective of "side-effects are the first enemy" and it will change everything you know about programming. Join me for part two, in which we take an awareness of side-effects, and functional programming, and fire a scattergun over the programming landscape.

Acknowledgments

This post comes out of a couple of discussions about the nature of functional programming. Particularly a chat with Sleepyfox discussing whether JavaScript could be considered a functional programming language, with the right libraries. My answer was an instinctive no, but thinking through why led me along a very fruitful chain of thought.

Hat tip to James Henderson, with whom I have bounced around many fruitful functional ideas this year.

And thanks to Malcolm Sparks, Ivan Uemlianin, Joel Clermont, Katy Moe and my homophonic doppelganger Chris Jenkins for proofreading & suggestions.

Footnotes:

[1] Although Java tries really hard.

Reference sites:
https://medium.com/@jooyunghan/%ED%95%A8%EC%88%98%ED%98%95-%ED%94%84%EB%A1%9C%EA%B7%B8%EB%9E%98%EB%B0%8D%EC%9D%B4%EB%9E%80-%EB%AC%B4%EC%97%87%EC%9D%B8%EA%B0%80-fab4e960d263#.lmzvbxqy4
http://blog.jenkster.com/2015/12/what-is-functional-programming.html