South African Skeptics

Topic: On programming
BoogieMonster
« on: May 31, 2011, 12:16:43 PM »

While we're digressing (I'll try to limit the extent... PS: I failed)...

Quote from: Mefiante
As for programming, the (almost) lost art of Assembler is frowned upon as no longer relevant from many quarters.

More money for me! You cannot truly understand how a computer operates without having that assembly language -> machine code -> bare metal thing click in your mind. Only people who have written at least SOME assembly, and have seen a CPU diagram or two, understand what a CPU is, and hence "how" a computer works. I agree this is sadly becoming a lost piece of understanding.

Quote
When writing fast, robust, numerically intensive solution engines for scientific or engineering applications, one can of course do this in a high-level language but the programmer has a significant advantage if s/he knows what the compiled code looks like at the CPU’s level. Compilers often blindly add bits of library code that the programmer may not even be aware of and that are unnecessary for the code in question.

Indeed, but as I think you imply, I'd still write it in good C++. The difference these days between that and raw assembly is negligible in all but the most extreme cases. Your I/O operations cost a lot more than CPU cycles, which can almost be seen as irrelevant. If you can do cache or memory optimisation, then yes; but from what I've read, understanding a modern cache system well enough to beat it with userland code is beyond mere mortals, and memory access will usually be governed by a (relatively) expensive call into your OS anyway, which may just decide to swap your highly hand-optimised memory out to hard disk. Unlucky. In 99.9999999% of cases, including some where people think they can do better, the compiler will win. That part of the "forget assembly" argument I buy.

But then a (good) C/C++ compiler for a PIC microchip costs a buttload of money, out of the range of a hobbyist, so there I write assembly. But even on a 20 MHz PIC your code gets executed so freaking fast (no OS, no task switching, no nothing: just your code, line by line, at 20 MHz, which is actually pretty amazing) that I've not yet found a situation where a PIC isn't sitting idle most of the time. Hence they tend to build all kinds of idle switching, power-save modes, etc. into them: even if your code has to run once every 10 ms, the chip can still go to sleep, save some power, and wake up again in time to do its job.

Quote
Also, CPU-specific optimisations, like effective multi-pipelining of concurrent instruction streams or instruction set extensions, are often lacking even from the best compilers.

It's (a very unfortunate) practicality thing: usually people will compile to target i686 or even earlier architectures to ensure backwards compatibility. Very seldom do you see someone custom-roll a bleeding-edge compile for the latest-and-greatest architectural advances. It's perhaps more common in the sciences, but not very common in consumer software. AFAIK Intel's compiler is the best when it comes to stuff like this (only logical), but I haven't had the need to investigate this a lot.

Quote
  It comes down to an understanding of what the code does at the CPU level, which understanding can help in eliminating a host of problems and inefficiencies before they occur.

A nice example of a bad problem is memory alignment. If a C++ developer doesn't understand what the compiler is going to do with certain data structures, he's going to have code that works on his machine (probably by fluke) and crashes badly on another platform, and he may have no idea why.
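
A minimal sketch of the padding side of this, assuming a typical x86-64 ABI (the exact layout is compiler- and platform-dependent, which is precisely the trap):

Code:
#include <cstdint>
#include <cstdio>

// Badly ordered members: the compiler inserts padding so each member
// lands on its natural alignment boundary.
struct Padded {
    char     tag;    // 1 byte + 7 bytes padding (on typical 64-bit ABIs)
    double   value;  // 8 bytes, wants 8-byte alignment
    uint16_t id;     // 2 bytes + 6 bytes tail padding
};

// Same members, largest first: far less padding.
struct Packed {
    double   value;
    uint16_t id;
    char     tag;    // + 5 bytes tail padding so arrays stay aligned
};

int main() {
    // Typically prints 24 and 16 on x86-64. Code that hard-codes either
    // number, or memcpy's such structs between platforms, is exactly the
    // "works by fluke" code described above.
    std::printf("sizeof(Padded) = %zu\n", sizeof(Padded));
    std::printf("sizeof(Packed) = %zu\n", sizeof(Packed));
    return 0;
}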
Mefiante
« Reply #1 on: May 31, 2011, 14:02:15 PM »

Permit me to derail a little more.  (Maybe spawning a new thread is in order… Grin )

Much of what you say is true for most everyday applications.  And of course there are the platform-independence and code maintenance issues.

You claim that IO ops are costly.  This is true if your IO source/target is non-volatile storage.  But many specialised scientific/engineering problems are amenable to being memory managed in such a way that misalignment wait-states, fragmented memory blocks and virtual memory access penalties are minimised or even eliminated altogether.  This requires a detailed familiarity with the hardware architecture and the innards of the OS you’re coding for.

While on the subject of “good C++,” are you aware of how awfully expensive the object destructors and, especially, the constructors are?  I assume you make regular use of structured exception handling too.  Do you know what exception handling costs?  These can be huge obstacles in badly-written numerical code.

There are certain types of specialised scientific/engineering problems whose numerical treatments are intensively repetitive, e.g. finite element, finite difference and optimisation problems.  (For reference, you might like to consider that our Weather Bureau’s daily forecast run took between four and five hours on a Cray-2.)   Often, one has a small set of core functions, each of which is called hundreds of millions or even billions of times during a solution run.  These functions may themselves be iterative.  Saving even a small percentage of the clock cycles these functions require can result in a significant saving in execution time.  One admittedly extreme example resulted in a reduction from over two hours down to less than three minutes (and, as a bonus, the reworked solution engine was much less prone to pathological crashes/exceptions), but halving or even quartering the run time is not uncommon.  Thus, you need to understand the nature of the problem you’re dealing with as well as the platform you’re going to solve it on.

I’m not saying that you have to write your entire program in highly optimised code.  That would be a waste of much effort.  However, where the nature of the problem warrants it, you should write the critical parts in carefully optimised code such as Assembler, and then link the assembled object code into the rest of your program using the high-level-language development environment (some of which allow you to make use of inline Assembler code).  Experience has shown that the extra effort is worth it but it takes a practised eye to gauge this properly.
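
For illustration, a minimal sketch of the inline-Assembler route (GCC-style extended asm on x86-64; the syntax, the constraint letters and even the availability of inline asm are compiler-specific, so treat this as one possible form only):

Code:
#include <cstdio>

// Hot inner routine with the critical operation written as inline
// assembly. The plain C++ equivalent is simply "return a + b;"; the
// point here is only the embedding/linkage mechanism.
static inline long add_asm(long a, long b) {
    long result;
    asm("lea (%1, %2), %0"      // result = a + b in a single LEA
        : "=r"(result)          // output: any general-purpose register
        : "r"(a), "r"(b));      // inputs kept in registers
    return result;
}

int main() {
    std::printf("%ld\n", add_asm(40, 2));  // prints 42
    return 0;
}

The alternative route is to assemble the critical routines separately and link the object files in, in which case the host language's calling convention has to be honoured by hand.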

'Luthon64
Faerie
« Reply #2 on: May 31, 2011, 14:04:36 PM »

so, ummm.... How many programmers are on this forum???  Undecided
Mefiante
« Reply #3 on: May 31, 2011, 14:09:17 PM »

Professionally, I don’t program any more, although I do write task-specific codes when there is no easy way to use existing tools.  I used to write such scientific and engineering codes in the manner I described earlier, though.

'Luthon64
Mandarb
« Reply #4 on: May 31, 2011, 15:18:44 PM »

I am; at the Joburg SITP meetings it's like 80% of the people are in IT.
I'm not at the level that BM and Mefiante are; I've only ever written seriously in high-level languages (mainly .Net). I can recognise Assembler and C++ when I see them, but might have some trouble reading them.
Mefiante
« Reply #5 on: May 31, 2011, 15:39:54 PM »

.Net usually “compiles” to MSIL, not native (x86) code.  MSIL is a bit like Java’s byte code in that it requires a platform-specific runtime environment.

'Luthon64
BoogieMonster
« Reply #6 on: May 31, 2011, 16:44:58 PM »

Quote from: Mefiante
Permit me to derail a little more.  (Maybe spawning a new thread is in order… Grin )

 Grin

Quote
You claim that IO ops are costly.  This is true if your IO source/target is non-volatile storage.

Maybe not on a Cray, but it's common to have PCs or servers (and those sometimes stick around for a while) where the memory runs at half the clock speed of the CPU or less. Things are improving, though.

Quote
But many specialised scientific/engineering problems are amenable to being memory managed in such a way that misalignment wait-states, fragmented memory blocks and virtual memory access penalties are minimised or even eliminated altogether.  This requires a detailed familiarity with the hardware architecture and the innards of the OS you’re coding for.

Yeah, I think this is where the crux of this discussion lies. Hand-rolling assembly has application in specialised fields like this, where you know your hardware and OS intimately; I completely agree. But for the "general-purpose guy" like me, even for server software, it's usually about optimising DB access, network access, and the like, not so much about cycles. So we're just coming from two different applications. And my point is, in part, that the compiler writers cater more to the application guys, in my opinion.

Quote
While on the subject of “good C++,” are you aware of how awfully expensive the object destructors and, especially, the constructors are?  I assume you make regular use of structured exception handling too.  Do you know what exception handling costs?

Yes, and it depends. Constructor writers have to be prudent: use initialisation lists and avoid assignments to mitigate costs. Usually you want your constructor body to be empty; the compiler can handle initialisation lists quite efficiently. If constructors are very slow, it's probably because there's more stuff in the object than is needed, leading to objects being constructed needlessly, which hints at a bad design decision. One could also be writing code that creates too many temporaries, resulting in unwanted construction/destruction overhead; the liberal use of references can save one a good whack of unneeded temporaries.

Since we're now completely off topic:
Code:
// The parameter is copied in, the member is default-constructed,
// and then assigned to: the string data is handled twice.
Constructor(std::string a)
{
    this->a = a;
}

is way worse than writing:

Code:
// Pass by const reference and use the initialisation list: the
// member is copy-constructed directly from the argument, once.
Constructor(const std::string& a) : a(a)
{ }

And a lot of people don't realise that. No, I haven't completely removed constructors, but you can remove a lot of them with some diligence. At the end of the day the memory needs to be initialised one way or the other, but it's nice to only do it once. (Unless, as I suspect, in your domain you may have wanted to forgo it entirely because you know exactly how the memory is going to be used, and program very carefully.) I (actually we) don't use exceptions regularly. We try to constrain those to TRULY "exceptional" cases (network errors, OS errors...); for general error handling we avoid them, not only for performance, but because certain guarantees are impossible to enforce when you allow exceptions.

BUT that is only what the language gives you. It's possible to write a block memory allocator in pure C++ that performs. It's a tricky task to make the compiler do a lot of the heavy lifting, usually involving "bending" the template language... but I know a person who did this; I don't know much about it, though.
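
For flavour, a minimal sketch of the block-allocator idea: a fixed-size pool with an intrusive free list. This illustrates the general technique only; the template-bending version mentioned above would be considerably more elaborate:

Code:
#include <cstddef>
#include <new>

// Fixed-size pool: carves a single array into N slots and hands them
// out from an intrusive free list. Allocation and deallocation are
// O(1) and never touch the general-purpose heap.
template <typename T, std::size_t N>
class BlockPool {
    union Slot {
        Slot* next;                                   // valid while free
        alignas(T) unsigned char storage[sizeof(T)];  // valid while in use
    };
    Slot  slots_[N];
    Slot* free_head_;
public:
    BlockPool() : free_head_(&slots_[0]) {
        for (std::size_t i = 0; i + 1 < N; ++i)
            slots_[i].next = &slots_[i + 1];          // thread the free list
        slots_[N - 1].next = nullptr;
    }
    void* allocate() {
        if (free_head_ == nullptr) throw std::bad_alloc();
        Slot* s = free_head_;
        free_head_ = s->next;                         // pop
        return s->storage;
    }
    void deallocate(void* p) {
        Slot* s = static_cast<Slot*>(p);              // storage sits at offset 0
        s->next = free_head_;                         // push
        free_head_ = s;
    }
};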

Oh, and on destruction... this is a trade-off. It sounds like you were working with large areas of contiguous memory that could be nicely block-allocated and managed. But in our problem space, what is needed is often unpredictable up front. So you have the option: destruction or garbage collection. I prefer destruction, because the deallocation or disconnect/cleanup code is going to happen anyway, albeit in a cascading fashion. I'd suspect, though, that clean nested destructors devoid of unnecessary (non-deallocation) code could be optimised into a single deallocation by the compiler. I'd have to read up, though.

Quote
...Thus, you need to understand the nature of the problem you’re dealing with as well as the platform you’re going to solve it on.

Bingo. I do understand that in certain science disciplines this is entirely necessary. The shift seems to be going in the direction of offloading repetitive tasks like those onto GPUs, which have their own languages and compilers, and which are very efficient at massively parallel processing (I hear).

Quote
I’m not saying that you have to write your entire program in highly optimised code.  That would be a waste of much effort.  However, where the nature of the problem warrants it, you should write the critical parts in carefully optimised code such as Assembler, and then link the assembled object code into the rest of your program using the high-level-language development environment (some of which allow you to make use of inline Assembler code).

For us cross-platform guys it's not usually an option. The case would have to be really extreme for us to consider maintaining specialised asm code for every architecture we support. And hence this discussion... horses for courses. But lemme let you in on a little secret: we DO have an atomic integer implemented in asm for every platform we support; it's a shitload more efficient than locking when it's applicable.
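
A minimal sketch of what one platform's version of such an atomic integer might look like (GCC-style inline assembly on x86; the real thing needs a separate variant per compiler/architecture pair, which is exactly the maintenance cost being weighed here):

Code:
// x86 atomic fetch-and-add via LOCK XADD. Returns the value held
// *before* the increment; safe against concurrent updates without
// taking a mutex.
static inline int atomic_fetch_add(volatile int* target, int delta) {
    int old = delta;
    asm volatile("lock; xaddl %0, %1"
                 : "+r"(old), "+m"(*target)  // exchange-and-add in one step
                 :
                 : "memory");                // also acts as a compiler barrier
    return old;
}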
BoogieMonster
« Reply #7 on: May 31, 2011, 16:51:11 PM »

Quote from: Mefiante
.Net usually “compiles” to MSIL, not native (x86) code.  MSIL is a bit like Java’s byte code in that it requires a platform-specific runtime environment.

To my shock I realised, upon touching a Windows compiler again the other day, that by default the new versions of MSVC++ compile to byte code too! No kidding: I fire up the exe and .NET launches in the background. What the hell? Luckily I found a way to turn it back to "good old fashioned" native binary building. Undecided
Mefiante
« Reply #8 on: May 31, 2011, 18:12:02 PM »

Quote from: BoogieMonster
Maybe not on a Cray, but it's common to have PCs or servers (and those sometimes stick around for a while) where the memory runs at half the clock speed of the CPU or less.
But that’s just my point:  If you properly understand the CPU’s prefetching & trace caching, pipelining and memory caching, and you schedule your instruction streams properly over the entire stretch of critical code, these clock penalties simply won’t apply at all because there’s nothing the CPU has to wait for.  Everything’s in the queue/cache already by the time it is needed.  Put another way, upgrading to faster RAM won’t do much for the performance of such optimised code.  (Of course, in practice it’s difficult to achieve 100% CPU efficiency, but 95%+ is doable.)
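
As a toy illustration of keeping the data queue fed (using GCC's __builtin_prefetch; the look-ahead distance of 16 elements is an invented tuning parameter, and in practice one measures rather than guesses):

Code:
#include <cstddef>

// Sum an array while prefetching a fixed distance ahead, so the cache
// line for iteration i+16 is already in flight while iteration i runs.
double sum_prefetched(const double* data, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        if (i + 16 < n)
            __builtin_prefetch(&data[i + 16]);  // a hint only; never faults
        total += data[i];
    }
    return total;
}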

The codes I’m referring to are normally highly specialised and are used by a small fraternity only.  The programmer knows what the target hardware is (or s/he is in a position to specify it).  It would be a different matter if wide platform coverage were required, but this isn’t usually so.  Also, code optimisations that work well for one generation of hardware are often well optimised for subsequent generations.  The standout exception to this was Intel’s early P4 CPU.  It ran P-III optimal code slower than the P-III, even at 50% higher clock speed.  At around that time, AMD bit a sizeable chunk out of Intel’s pie.

Another point to note is that there is a certain class of problems that cannot be efficiently parallelised.  That is, throwing a problem of this kind at a computing cluster won’t give appreciably faster results than running it on an autonomous single-CPU box of the same hardware configuration.  So once again the message is that you need to understand the nature of the problem you’re dealing with.

Quote from: BoogieMonster
Hand-rolling assembly has application in specialised fields like this, where you know your hardware and OS intimately; I completely agree. But for the "general-purpose guy" like me, even for server software, it's usually about optimising DB access, network access, and the like, not so much about cycles. So we're just coming from two different applications. And my point is, in part, that the compiler writers cater more to the application guys, in my opinion.
Agreed in all respects, except to reiterate that even for general-purpose programming, it’s still a real benefit if the programmer understands what the high-level code’s going to be doing on the CPU.

One of the most horrendous bits of OOP I’ve ever seen was actually done by a computer science graduate.  In one function that was typically called many millions of times, there was a “while” loop.  Within the scope of this “while” construct, an object of a certain class was instantiated for the sake of one method that was needed, and then dutifully destroyed again.  The object in question wasn’t used anywhere else in the code, either directly or via descendant classes.  When you encounter something so obviously inept, you have to wonder how on earth the offender managed to graduate.
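
In code, the blunder and its fix look roughly like this (Widget and the rest are hypothetical names for the sake of the sketch, assuming construction is expensive and the object carries no per-iteration state):

Code:
// Hypothetical expensive-to-construct class, for illustration only.
struct Widget {
    Widget();              // assume: expensive setup (allocations etc.)
    ~Widget();             // assume: matching teardown
    void process(int item);
};

void blunder(const int* items, int n) {
    for (int i = 0; i < n; ++i) {
        Widget w;          // full constructor cost, every iteration...
        w.process(items[i]);
    }                      // ...and full destructor cost here
}

void fixed(const int* items, int n) {
    Widget w;              // loop-invariant: constructed exactly once
    for (int i = 0; i < n; ++i)
        w.process(items[i]);
}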

'Luthon64
rwenzori
« Reply #9 on: May 31, 2011, 18:17:14 PM »

Quote from: Faerie
so, ummm.... How many programmers are on this forum???  Undecided

Anyone for COBOL?  Wink

Like Mandarb, I'm not at the level of assembler (way too lazy) and I never got my head around C or C++, but I made my living programming in BASIC (post-COBOL, LOL!), doing contract programming and selling shareware, way back when. Now it's just hobby projects for me and friends.
benguela
« Reply #10 on: May 31, 2011, 18:18:12 PM »

Quote from: Mefiante
There are certain types of specialised scientific/engineering problems whose numerical treatments are intensively repetitive, e.g. finite element, finite difference and optimisation problems.

I still use Fortran for much of this.

Rigil Kent
« Reply #11 on: May 31, 2011, 18:25:15 PM »

In primary school, I used to make a tiny, unconvincing turtle cross the screen, turn left and go beep.
But with the arrival of GW-BASIC and the XT somewhere in the '80s, the sky became the limit. The jewel in the crown of my programming career was a sequence that caused the computer to break wind when switched on. Ah... those were the days! Roll Eyes

Mintaka
Mefiante
« Reply #12 on: May 31, 2011, 18:41:47 PM »

Quote from: benguela
I still use Fortran for much of this.
That’s fine if you’re happy with standalone console applications and hand-prepped input data files (where applicable).  As with every high-level language, FORTRAN compilers do not necessarily produce optimal code.

If it’s your aim to integrate FORTRAN object code into a program much of which is written in another high-level language, there are some problems concerning calling conventions and parameter passing.  Moreover, if your code passes matrices (2-D arrays) between a different high-level language and FORTRAN, you are inviting disaster.  In FORTRAN, matrices are column-major, whereas they are row-major in C and most other high-level languages.  This means that you have to reorder your matrices explicitly both before and after the FORTRAN code operates on them.  Many an unwary scientific programmer has come a totally baffled cropper on this subtle point.
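
A minimal illustration of the reordering from the C++ side (the Fortran routine's name and its trailing-underscore mangling are assumptions for the example; actual name decoration and calling conventions vary by compiler):

Code:
// C++ calling a hypothetical Fortran subroutine SOLVE(A, N) that
// expects an N-by-N matrix. Fortran stores matrices column by column,
// C/C++ row by row, so the data must be transposed both ways.
extern "C" void solve_(double* a, int* n);   // hypothetical Fortran entry point

void call_fortran(double c[3][3]) {
    double f[9];
    int n = 3;

    for (int i = 0; i < 3; ++i)              // row-major -> column-major
        for (int j = 0; j < 3; ++j)
            f[j * 3 + i] = c[i][j];

    solve_(f, &n);                           // Fortran passes everything by reference

    for (int i = 0; i < 3; ++i)              // column-major -> row-major
        for (int j = 0; j < 3; ++j)
            c[i][j] = f[j * 3 + i];
}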

'Luthon64
benguela
« Reply #13 on: May 31, 2011, 19:08:38 PM »

Quote from: Mefiante
If it’s your aim to integrate FORTRAN object code into a program much of which is written in another high-level language, there are some problems concerning calling conventions and parameter passing.  Moreover, if your code passes matrices (2-D arrays) between a different high-level language and FORTRAN, you are inviting disaster.  In FORTRAN, matrices are column-major, whereas they are row-major in C and most other high-level languages.  This means that you have to reorder your matrices explicitly both before and after the FORTRAN code operates on them.  Many an unwary scientific programmer has come a totally baffled cropper on this subtle point.


Sjoe, thanks but no thanks: FORTRAN and Perl serve my needs just fine. Running symplectic integrators overnight on the server is "good enough". Injecting FORTRAN into what? C++? Urrgg, heaven forbid Wink. Assembler into PASCAL, however, is fun; remember the good ol' days of 1k - 4k demo comps?

Mefiante
« Reply #14 on: May 31, 2011, 19:43:49 PM »

Quote from: benguela
[R]emember the good ol' days of 1k - 4k demo comps ?
The last I heard on the subject (I’ve been out of this game for several years now), there was the 64k Challenge, where competitors (individuals and teams) effectively got locked into a hotel room for a whole week together with some PCs bristling with low-level tools and assorted programming environments.  At the end of the week they had to produce a code image of 64 kB or smaller.  Code compactors/decompressors were OK as long as they were coded by the programmer(s).  The slickest app would win quite a prestigious prize.  Rumour has it that the standard diet for the competitors was pizzas and pancakes:  it was the only food they could slip under the door… Wink

Then, for the diehard DOS/Real Mode gurus, there was the 256-byte challenge.  I don’t know if that one’s still going, but the objective was to produce the coolest app with a code image of 256 bytes or less (inevitably, a .COM file).  Some of the things these guys did in that tiny space are simply amazing.  For comparison, the smallest possible MS Windows GUI program is 512 bytes in size, and all it does is something trivial like pop up a “Hello World!” message box.  It needs to be assembled and linked by hand (which is a fancy way of saying that you need to write out the binary by hand) because the smallest PE section boundary that linkers recognise is 1 kB (= 1,024 bytes).

'Luthon64