UFO ET IT

F # 대 OCaml : 스택 오버플로

ufoet 2020. 12. 1. 20:09
반응형

F # 대 OCaml : 스택 오버플로


나는 최근 에 파이썬 프로그래머를위한 F #에 대한 프레젠테이션을 찾았고 그것을 본 후 "개미 퍼즐"에 대한 해결책을 직접 구현하기로 결정했습니다.

평면 격자 위를 걸을 수있는 개미가 있습니다. 개미는 한 번에 한 칸씩 왼쪽, 오른쪽, 위, 아래로 이동할 수 있습니다. 즉, (x, y) 셀에서 개미는 (x + 1, y), (x-1, y), (x, y + 1) 및 (x, y-1) 셀로 이동할 수 있습니다. x 및 y 좌표의 자릿수 합계가 25보다 큰 점은 개미가 액세스 할 수 없습니다. 예를 들어, 포인트 (59,79)는 25보다 큰 5 + 9 + 7 + 9 = 30이기 때문에 액세스 할 수 없습니다. 질문은 다음과 같습니다. (1000, 1000)에서 시작하면 개미가 액세스 할 수있는 포인트 수, (1000, 1000) 자체 포함?

먼저 30 줄의 OCaml에 솔루션을 구현 하고 사용해 보았습니다.

$ ocamlopt -unsafe -rectypes -inline 1000 -o puzzle ant.ml
$ time ./puzzle
Points: 148848

real    0m0.143s
user    0m0.127s
sys     0m0.013s

깔끔하게, 내 결과는 D 및 C ++에서 leonardo의 구현 과 동일 합니다. leonardo의 C ++ 구현과 비교할 때 OCaml 버전은 C ++보다 약 2 배 느리게 실행됩니다. leonardo가 재귀를 제거하기 위해 큐를 사용했음을 감안할 때 괜찮습니다.

그런 다음 코드를 F #으로 번역했는데 다음 과 같습니다.

Thanassis@HOME /g/Tmp/ant.fsharp
$ /g/Program\ Files/FSharp-2.0.0.0/bin/fsc.exe ant.fs
Microsoft (R) F# 2.0 Compiler build 2.0.0.0
Copyright (c) Microsoft Corporation. All Rights Reserved.

Thanassis@HOME /g/Tmp/ant.fsharp
$ ./ant.exe

Process is terminated due to StackOverflowException.
Quit

Thanassis@HOME /g/Tmp/ant.fsharp
$ /g/Program\ Files/Microsoft\ F#/v4.0/Fsc.exe ant.fs
Microsoft (R) F# 2.0 Compiler build 4.0.30319.1
Copyright (c) Microsoft Corporation. All Rights Reserved.

Thanassis@HOME /g/Tmp/ant.fsharp
$ ./ant.exe

Process is terminated due to StackOverflowException

스택 오버플로 ... 내 컴퓨터에있는 두 버전의 F #을 사용하여 ... 호기심으로 생성 된 바이너리 (ant.exe)를 가져와 Arch Linux / Mono에서 실행합니다.

$ mono -V | head -1
Mono JIT compiler version 2.10.5 (tarball Fri Sep  9 06:34:36 UTC 2011)

$ time mono ./ant.exe
Points: 148848

real    1m24.298s
user    0m0.567s
sys     0m0.027s

놀랍게도 Mono 2.10.5 (즉, 스택 오버플로 없음)에서 실행되지만 84 초가 걸립니다. 즉 OCaml보다 587 배 느립니다.

그래서이 프로그램은 ...

  • OCaml에서 잘 실행됩니다.
  • .NET / F #에서는 전혀 작동하지 않습니다.
  • 작동하지만 Mono / F #에서는 매우 느립니다.

왜?

편집 : 이상 함이 계속됩니다- "--optimize + --checked-"를 사용하면 문제가 사라지 지만 ArchLinux / Mono에서만 가능합니다 . Windows XP 및 Windows 7/64 비트에서는 최적화 된 버전의 바이너리 스택도 오버플로됩니다.

최종 편집 : 직접 답을 찾았습니다-아래를 참조하십시오.


요약 :

  • 꼬리 재귀 적이 지 않은 간단한 알고리즘 구현을 썼습니다.
  • Linux에서 OCaml로 컴파일했습니다.
  • 잘 작동했고 0.14 초 만에 끝났습니다.

그런 다음 F #으로 이식 할 시간입니다.

  • 코드 (직접 번역)를 F #으로 번역했습니다.
  • Windows에서 컴파일하고 실행합니다. 스택 오버플로가 발생했습니다.
  • Linux에서 바이너리를 가져와 Mono에서 실행했습니다.
  • 작동했지만 매우 느리게 실행됩니다 (84 초).

그런 다음 Stack Overflow에 게시했지만 일부 사람들은 질문을 종료하기로 결정했습니다 (한숨).

  • I tried compiling with --optimize+ --checked-
  • The binary still stack overflowed under Windows...
  • ...but run fine (and finished in 0.5 seconds) under Linux/Mono.

It was time to check the stack size: Under Windows, another SO post pointed out that it is set by default to 1MB. Under Linux, "uname -s" and a compilation of a test program clearly showed that it is 8MB.

This explained why the program worked under Linux and not under Windows (the program used more than 1MB of stack). It didn't explain why the optimized version run so much better under Mono than the non-optimized one: 0.5 seconds vs 84 seconds (even though the --optimize+ appears to be set by default, see comment by Keith with "Expert F#" extract). Probably has to do with the garbage collector of Mono, which was somehow driven to extremes by the 1st version.

The difference between Linux/OCaml and Linux/Mono/F# execution times (0.14 vs 0.5) is because of the simple way I measured it: "time ./binary ..." measures the startup time as well, which is significant for Mono/.NET (well, significant for this simple little problem).

Anyway, to solve this once and for all, I wrote a tail-recursive version - where the recursive call at the end of the function is transformed into a loop (and hence, no stack usage is necessary - at least in theory).

The new version run fine under Windows as well, and finished in 0.5 seconds.

So, moral of the story:

  • Beware of your stack usage, especially if you use lots of it and run under Windows. Use EDITBIN with the /STACK option to set your binaries to larger stack sizes, or better yet, write your code in a manner that doesn't depend on using too much stack.
  • OCaml may be better at tail-recursion elimination than F# - or it's garbage collector is doing a better job at this particular problem.
  • Don't despair about ...rude people closing your Stack Overflow questions, good people will counteract them in the end - if the questions are really good :-)

P.S. Some additional input from Dr. Jon Harrop:

...you were just lucky that OCaml didn't overflow as well. You already identified that actual stack sizes vary between platforms. Another facet of the same issue is that different language implementations eat stack space at different rates and have different performance characteristics in the presence of deep stacks. OCaml, Mono and .NET all use different data representations and GC algorithms that impact these results... (a) OCaml uses tagged integers to distinguish pointers, giving compact stack frames, and will traverse everything on the stack looking for pointers. The tagging essentially conveys just enough information for the OCaml run time to be able to traverse the heap (b) Mono treats words on the stack conservatively as pointers: if, as a pointer, a word would point into a heap-allocated block then that block is considered to be reachable. (c) I do not know .NET's algorithm but I wouldn't be surprised if it ate stack space faster and still traversed every word on the stack (it certainly suffers pathological performance from the GC if an unrelated thread has a deep stack!)... Moreover, your use of heap-allocated tuples means you'll be filling the nursery generation (e.g. gen0) quickly and, therefore, causing the GC to traverse those deep stacks often...


Let me try to summarize the answer.

There are 3 points to be made:

  • problem: stack overflow happens on a recursive function
  • it happens only under windows: on linux, for the proble size examined, it works
  • same (or similar) code in OCaml works
  • optimize+ compiler flag, for the proble size examined, works

It is very common that a Stack Overflow exception is the result of a recursive vall. If the call is in tail position, the compiler may recognize it and apply tailcall optimization, therefore the recursive call(s) will not take up stack space. Tailcall optimization may happen in F#, in the CRL, or in both:

CLR tail optimization1

F# recursion (more general) 2

F# tail calls 3

The correct explanation for "fails on windows, not in linux" is, as other said, the default reserved stack space on the two OS. Or better, the reserved stack space used by the compilers under the two OSes. By default, VC++ reserves only 1MB of stack space. The CLR is (likely) compiled with VC++, so it has this limitation. Reserved stack space can be increased at compile time, but I'm not sure if it can be modified on compiled executables.

EDIT: turns out that it can be done (see this blog post http://www.bluebytesoftware.com/blog/2006/07/04/ModifyingStackReserveAndCommitSizesOnExistingBinaries.aspx) I would not reccomend it, but in extreme situations at least it is possible.

OCaml version may work because it was run under Linux. However, it would be interesting to test also the OCaml version under Windows. I know that the OCaml compiler is more aggressive at tail-call optimization than F#.. could it even extract a tail-recusable function from your original code?

My guess about "--optimize+" is that it will still cause the code to recur, hence it will still fail under Windows, but will mitigate the problem by making the executable run faster.

Finally, the definitive solution is to use tail recursion (by rewriting the code or by realying on aggressive compiler optimization); it is a good way to avoid stack overflow problem with recursive functions.

참고URL : https://stackoverflow.com/questions/7538584/f-vs-ocaml-stack-overflow

반응형