Archive for the ‘Free Software’ Category

20 Free Data Recovery Software Tools (September 2019)

Many free data recovery programs exist that can help you recover, or "undelete," accidentally deleted files on your computer.

Data recovery software is just one way to go. See How to Recover Deleted Files for a complete tutorial, including how to avoid common pitfalls during the file recovery process.

Undelete files you thought were gone forever with any one of these freeware data recovery tools, including documents, videos, images, music/audio files, and more.

Portable option is available

Lots of advanced options

A wizard walkthrough makes it easy to use

Works on most Windows operating systems

Recuva is the very best free data recovery software tool available, hands down. It's very easy to use but has many optional advanced features as well.

Recuva can recover files from hard drives, external drives (USB drives, etc.), BD/DVD/CD discs, and memory cards. Recuva can even undelete files from your iPod!

Undeleting a file with Recuva is as easy as deleting one! I highly recommend that you try Recuva first if you need to recover a file.

Recuva will undelete files in Windows 10, Windows 8 & 8.1, 7, Vista, XP, Server 2008/2003, and older Windows versions like 2000, NT, ME, and 98. 64-bit versions of Windows are also supported, and there is a 64-bit version of Recuva available.

Piriform provides both an installable and a portable version of Recuva. I tested file recovery with Recuva v1.53.1087 using their portable version on Windows 8.1.

Two ways to view the list of deleted files

Supports running as a portable version

Scans NTFS and FAT12/16/32 file systems

It's easy to see whether the file can be recovered well

Free for home use only, not commercial/business

Hasn't been updated since 2016

Puran File Recovery is one of the better free data recovery programs I've seen. It's very easy to use, will scan any drive that Windows sees, and has a lot of advanced options if you need them.

One particular thing to note is that Puran File Recovery identified more files on my test machine than most other tools, so be sure to give this one a shot in addition to Recuva if Recuva didn't find what you were looking for.

Puran File Recovery will even recover lost partitions if they haven't been overwritten yet.

Puran File Recovery works with Windows 10, 8, 7, Vista, and XP. It's also available in a portable form for both 32-bit and 64-bit versions of Windows, so it doesn't require installation.

Organizes deleted files by category for easier viewing

Lets you filter the results by size and/or date

Supports a quick scan and a deep scan mode

Works with several different file systems

Lets you recover only 500 MB of data

Has to be installed to the HDD (no portable version)

You can't see how recoverable a file is before restoration

Disk Drill is an excellent free data recovery program not only because of its features but also due to the very simple design, making it almost impossible to get confused.

According to their website, Disk Drill can recover data (up to 500 MB) from "virtually any storage device," such as internal and external hard drives, USB devices, memory cards, and iPods.

Disk Drill can also preview files before recovering them, pause scans and resume them later, perform partition recovery, back up an entire drive, filter files by date or size, run a quick scan versus a full scan for faster results, and save scan results so you can easily import them again to recover deleted files at a later time.

Disk Drill works with Windows 10, 8, and 7, as well as macOS 10.10 and newer.

Pandora Recovery was another file recovery program, but it now exists as Disk Drill. If you're looking for that program, you can find the last released version on Softpedia.

Explains very clearly whether the file will recover fully

The download file is small

Viewing the list of deleted files is easy and user friendly

The program hasn't been updated in a long time

Can't be used portably, so you have to install it

Setup attempts to install another program with Glary Undelete

Glary Undelete is an excellent free file recovery program. It's very easy to use and has one of the better user interfaces that I've seen.

The biggest advantages of Glary Undelete include the easy "Folders" view, an Explorer-style view of recoverable files, and a prominent "State" indication for each file, suggesting how likely a successful file recovery will be.

One disadvantage of Glary Undelete is that installation is required before you can use it. Another is that you're asked to install a toolbar, but you can, of course, decline if you don't want it. Aside from those facts, Glary Undelete is top notch.

Glary Undelete can recover files from hard drives and any removable media you might have including memory cards, USB drives, etc.

Glary Undelete is said to work in Windows 7, Vista, and XP, but it also works fine in Windows 10, Windows 8, and versions older than Windows XP. I tested Glary Undelete v5.0 in Windows 7.

It's really easy to use

Works from any portable location like a flash drive

You can search for deleted files by file extension and file name

Lets you restore more than one file simultaneously

Supports only two file systems (however, they are the most popular)

You can't preview an image file before restoring it

Unlike most file recovery tools, this one doesn't let you see how successful the file recovery will be

SoftPerfect File Recovery is another superb file undelete program. It's very easy to search for recoverable files. Anyone should be able to use this program with very little trouble.

SoftPerfect File Recovery will undelete files from hard drives, memory cards, etc. Any device on your PC that stores data (except for your CD/DVD drive) should be supported.

SoftPerfect File Recovery is a small, 500 KB, standalone file, making the program very portable. Feel free to run File Recovery from a USB drive or floppy disk. Scroll down a bit on the download page to find it.

Windows 8, 7, Vista, XP, Server 2008 & 2003, 2000, NT, ME, 98, and 95 are all supported. According to SoftPerfect, 64-bit versions of Windows operating systems are also supported.

I tested SoftPerfect File Recovery v1.2 in Windows 10 without any issues.

You can back up the scan results to restore files later without having to rescan the whole drive

Works on Windows and macOS

Lets you sort files by file type, date it was removed, and name

File recovery is easy because you can browse the folders like you would in Explorer

Supports previewing files prior to restoration

EaseUS Data Recovery Wizard is another great file undelete program. Recovering files is very easy to do with just a few clicks.

My favorite aspect of EaseUS Data Recovery Wizard is that the user interface is structured much like Windows Explorer. While that may not be everyone's ideal way to display files, it's a very familiar interface that most people are comfortable with.

EaseUS Data Recovery Wizard will undelete files from hard drives, optical drives, memory cards, iOS devices, and pretty much anything else that Windows sees as a storage device. It also does partition recovery!

Please know that Data Recovery Wizard will only recover a total of 500 MB of data before you'll need to upgrade (or up to 2 GB if you use the share button in the program to post about the software on Facebook, Twitter, or Google+).

I almost didn't include this program because of that limitation but since most situations call for undeleting much less than that, I'll let it slide.

Data Recovery Wizard supports Mac and Windows 10, 8, 7, Vista, and XP, as well as Windows Server 2012, 2008, and 2003.

Scans for deleted files quickly

Colored circles make it easy to quickly see whether a file will have a good or poor chance at recovering fully

There's a portable option

Works with Windows 10 through XP

When undeleting files, the original folder structure is not retained

Doesn't work on Mac or Linux

Wise Data Recovery is a free undelete program that's really simple to use.

The program installed very quickly and scanned my PC in record time. Wise Data Recovery can scan various USB devices like memory cards and other removable devices.

An instant search function makes it really quick and easy to search for deleted files that Wise Data Recovery has found. A Recoverability column shows the likelihood of a file being recovered with Good, Poor, Very Poor, or Lost. Just right-click to restore a file.

Wise Data Recovery works with Windows 10, 8, 7, Vista, and XP. There's also a portable version available.

Really easy to use

Portable program

Several ways to sort the results

Can search for empty deleted files

Lets you overwrite the deleted data

Supports up to Windows XP (officially; but still works on some newer OSs)

Doesn't work in Windows 8

Can't restore a whole folder at once, just single files

Doesn't say how recoverable the file is before you restore it

The Restoration data recovery program is similar to the other free undelete apps on this list.

The thing I like most about Restoration is how incredibly simple it is to recover files. There are no cryptic buttons or complicated file recovery procedures; everything you need is on one easy-to-understand program window.

Restoration can recover files from hard drives, memory cards, USB drives, and other external drives.

Like some of the other popular data recovery tools on this list, Restoration is small and does not need to be installed, giving it the flexibility to be run from a floppy disk or USB drive.

Restoration is said to support Windows Vista, XP, 2000, NT, ME, 98, and 95. I successfully tested it with Windows 10 and Windows 7, and didn't run into any problems. However, v3.2.13 didn't work for me in Windows 8.

Can undelete files from a variety of storage devices

Simple user interface that isn't hard to understand

There's a portable option

Helpful filtering and sorting options

Restores entire folders at once, as well as single or multiple files

Lets you know how successful the recovery will be before starting

FreeUndelete is self-explanatory: it's free and it undeletes files! It's very similar to other undelete utilities around this rank on our list.

The major advantage of FreeUndelete is its easy-to-use interface and "folder drill down" functionality (i.e., files available for recovery are not shown in a big, unmanageable listing).

Free CAD Software – ExpressPCB

Start by downloading our NEW free CAD software ExpressPCB Plus! It includes ExpressSCH Classic for drawing schematics and ExpressPCB Plus for circuit board layout.

Both programs are completely free, fully functional, and easily installed with a single InstallShield setup program.

Learning to use our software is fast because of its standardized Windows user interface.

We recommend that you begin your project by drawing a schematic. While not required, it will save you time when designing your PCB.

Drawing a schematic with the ExpressSCH program is as easy as placing the components on the page and wiring the pins together.

More about ExpressSCH Schematic

Designing 2 or 4 layer boards using the ExpressPCB Plus program is very simple. Start by inserting the component footprints, then drag them into position. Next, connect the pins by drawing the traces.

More about ExpressPCB Plus Layout

After completing your layout, you can determine how much it will cost.

The ExpressPCB Plus program displays the exact manufacturing cost when you select Order Boards via the File menu.

Now the fun part. You order your PC boards directly from the ExpressPCB layout program. Here is how:

1. Run ExpressPCB Plus and select Order Boards Via the File menu.

2. In the order form fill in your name, address, email address and the quantity of boards you need.

3. To pay for the boards, we bill your credit card the exact amount shown by the Compute Boards Costs command. But don't worry, we encrypt your credit card number, along with the entire order, before it is sent over the Internet.

4. Press the Place Order button to submit your order. It is sent directly to the ExpressPCB server.

More PCB Ordering Information

All You Like – Download ALLmost Everything YOU LIKE

Psiphon is a free and open-source Internet censorship circumvention tool that uses a combination of secure communication and obfuscation technologies (VPN, SSH, and HTTP Proxy). Psiphon is specifically designed to support users in countries considered to be "enemies of the Internet," and it operates systems and technologies designed to help Internet users securely bypass the content-filtering systems used by governments to impose censorship of the Internet.

The iron-fisted Akhandanand Tripathi is a millionaire carpet exporter and the mafia don of Mirzapur. His son, Munna, is an unworthy, power-hungry heir who will stop at nothing to inherit his father's legacy. An incident at a wedding procession forces him to cross paths with Ramakant Pandit, an upstanding lawyer, and his sons, Guddu and Bablu.

Investigative journalist Eddie Brock attempts a comeback following a scandal, but accidentally becomes the host of an alien symbiote that gives him a violent super alter-ego: Venom. Soon, he must rely on his newfound powers to protect the world from a shadowy organisation looking for a symbiote of their own.

Decca Broadway proudly presents the original cast recording of WICKED, Broadway's most talked-about new musical. The box office is already over $10 Million! With a score by Stephen Schwartz.

The Winter Festival is coming and Po is asked to host the great exclusive formal banquet for all the Masters of Kung Fu. However, the occasion is on the same night as his father's restaurant's own party, and Mr. Ping, upset at his son's absence, will not cancel it to cook for the masters at Po's request. Burdened by his father's imposed guilt about his conflicting responsibilities, Po finds all the preparations a dispiriting struggle. However, the solution comes from where he least expects it, even as the panda must decide who truly needs him more on the big night.

A war between heaven and hell is raging on Earth and hormonal fury is raging in Issei's pants. Enter curvy redhead Rias, president of The Occult Research Club: a club that doesn't actually research the occult. They are the occult and Rias is a Devil!

After having suffered a heart-attack, a 59-year-old carpenter must fight the bureaucratic forces of the system in order to receive Employment and Support Allowance.

The high-adrenaline adventures of White Watch, a team of London firefighters. Leading the crew is Kev, a good man injured and betrayed during the worst fire of his career. Standing by Kev's side as he returns to work is his gutsy girlfriend Trish and his cocksure friend and fellow firefighter Mal. Other members of the crew include the fearless Ziggy and the mysterious new boy Dennis.

Tensions mount between the sons of Ragnar Lothbrok as the Vikings continue to threaten the very heart of England.

The world's most beloved fairy tales are re-imagined as a dark and twisted psychological thriller set in modern day New York City. The first season of this serialized drama interweaves The Three Little Pigs, Little Red Riding Hood, and Jack and the Beanstalk into an epic and subversive tale of love, loss, greed, revenge, and murder.

The film revolves around an undercover police officer who attempts to take down a drug trafficking syndicate from the inside.

Mac (Seth Rogen) and Kelly (Rose Byrne) are ready to make the final move into adulthood. But just as they thought they had reclaimed the neighborhood, they learn that their new neighbors are even more out of control than the last. To evict them, they will need help from their ex-neighbor (Zac Efron).

Free Software Download – Get Your Daily Free Download Now

Free Software direct to your inbox every day! Sorry, we don't have a free deal here today, but we have a solution for you. Sign up for our Daily Bits email and never miss out on anything we feature. Join the over 200,000 subscribers who receive great deals on free software every single day. We promise to be careful with your email address and only bring you deals that you want. We feature discounts from over 2000 software publishers, with many different discounts and giveaways for both Mac and PC software. Sign up and start getting the savings today.

The BitsDuJour community finds great software discounts every day, check out our Software Discounts Forum and find more deals! If you're interested in learning more about how we can bring you these great deals then read about how our giveaway of the day deals run without a catch.

Video is a great tool to get your message across the web. Here's the problem: you don't have a budget, and you don't know anything about video editing.

The Free Lunch Is Over: A Fundamental Turn Toward …

By Herb Sutter

The biggest sea change in software development since the OO revolution is knocking at the door, and its name is Concurrency.

This article appeared in Dr. Dobb's Journal, 30(3), March 2005. A much briefer version under the title "The Concurrency Revolution" appeared in C/C++ Users Journal, 23(2), February 2005.

Update note: The CPU trends graph was last updated in August 2009 to include current data and show that the trend continues as predicted. The rest of this article, including all text, is still original as first posted here in December 2004.

Your free lunch will soon be over. What can you do about it? What are you doing about it?

The major processor manufacturers and architectures, from Intel and AMD to Sparc and PowerPC, have run out of room with most of their traditional approaches to boosting CPU performance. Instead of driving clock speeds and straight-line instruction throughput ever higher, they are instead turning en masse to hyperthreading and multicore architectures. Both of these features are already available on chips today; in particular, multicore is available on current PowerPC and Sparc IV processors, and is coming in 2005 from Intel and AMD. Indeed, the big theme of the 2004 In-Stat/MDR Fall Processor Forum was multicore devices, as many companies showed new or updated multicore processors. Looking back, it's not much of a stretch to call 2004 the year of multicore.

And that puts us at a fundamental turning point in software development, at least for the next few years and for applications targeting general-purpose desktop computers and low-end servers (which happens to account for the vast bulk of the dollar value of software sold today). In this article, I'll describe the changing face of hardware, why it suddenly does matter to software, and how specifically the concurrency revolution matters to you and is going to change the way you will likely be writing software in the future.

Arguably, the free lunch has already been over for a year or two, only we're just now noticing.

There's an interesting phenomenon that's known as "Andy giveth, and Bill taketh away." No matter how fast processors get, software consistently finds new ways to eat up the extra speed. Make a CPU ten times as fast, and software will usually find ten times as much to do (or, in some cases, will feel at liberty to do it ten times less efficiently). Most classes of applications have enjoyed free and regular performance gains for several decades, even without releasing new versions or doing anything special, because the CPU manufacturers (primarily) and memory and disk manufacturers (secondarily) have reliably enabled ever-newer and ever-faster mainstream systems. Clock speed isn't the only measure of performance, or even necessarily a good one, but it's an instructive one: We're used to seeing 500MHz CPUs give way to 1GHz CPUs give way to 2GHz CPUs, and so on. Today we're in the 3GHz range on mainstream computers.

The key question is: When will it end? After all, Moore's Law predicts exponential growth, and clearly exponential growth can't continue forever before we reach hard physical limits; light isn't getting any faster. The growth must eventually slow down and even end. (Caveat: Yes, Moore's Law applies principally to transistor densities, but the same kind of exponential growth has occurred in related areas such as clock speeds. There's even faster growth in other spaces, most notably the data storage explosion, but that important trend belongs in a different article.)

If you're a software developer, chances are that you have already been riding the free lunch wave of desktop computer performance. Is your application's performance borderline for some local operations? Not to worry, the conventional (if suspect) wisdom goes; tomorrow's processors will have even more throughput, and anyway today's applications are increasingly throttled by factors other than CPU throughput and memory speed (e.g., they're often I/O-bound, network-bound, database-bound). Right?

Right enough, in the past. But dead wrong for the foreseeable future.

The good news is that processors are going to continue to become more powerful. The bad news is that, at least in the short term, the growth will come mostly in directions that do not take most current applications along for their customary free ride.

Over the past 30 years, CPU designers have achieved performance gains in three main areas, the first two of which focus on straight-line execution flow:

clock speed

execution optimization

cache

Increasing clock speed is about getting more cycles. Running the CPU faster more or less directly means doing the same work faster.

Optimizing execution flow is about doing more work per cycle. Today's CPUs sport some more powerful instructions, and they perform optimizations that range from the pedestrian to the exotic, including pipelining, branch prediction, executing multiple instructions in the same clock cycle(s), and even reordering the instruction stream for out-of-order execution. These techniques are all designed to make the instructions flow better and/or execute faster, and to squeeze the most work out of each clock cycle by reducing latency and maximizing the work accomplished per clock cycle.

Chip designers are under so much pressure to deliver ever-faster CPUs that they'll risk changing the meaning of your program, and possibly break it, in order to make it run faster

Brief aside on instruction reordering and memory models: Note that some of what I just called optimizations are actually far more than optimizations, in that they can change the meaning of programs and cause visible effects that can break reasonable programmer expectations. This is significant. CPU designers are generally sane and well-adjusted folks who normally wouldn't hurt a fly, and wouldn't think of hurting your code normally. But in recent years they have been willing to pursue aggressive optimizations just to wring yet more speed out of each cycle, even knowing full well that these aggressive rearrangements could endanger the semantics of your code. Is this Mr. Hyde making an appearance? Not at all. That willingness is simply a clear indicator of the extreme pressure the chip designers face to deliver ever-faster CPUs; they're under so much pressure that they'll risk changing the meaning of your program, and possibly break it, in order to make it run faster. Two noteworthy examples in this respect are write reordering and read reordering: Allowing a processor to reorder write operations has consequences that are so surprising, and break so many programmer expectations, that the feature generally has to be turned off because it's too difficult for programmers to reason correctly about the meaning of their programs in the presence of arbitrary write reordering. Reordering read operations can also yield surprising visible effects, but that is more commonly left enabled anyway because it isn't quite as hard on programmers, and the demands for performance cause designers of operating systems and operating environments to compromise and choose models that place a greater burden on programmers because that is viewed as a lesser evil than giving up the optimization opportunities.
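To make the reordering hazard concrete, here is a minimal sketch of the classic publication race. The article itself contains no code and predates standardized C++ atomics; C++11 threads and atomics, and the invented names payload/ready, are used here purely as illustration.

    #include <atomic>
    #include <thread>

    int payload = 0;                    // ordinary data written by the producer
    std::atomic<bool> ready{false};     // publication flag

    void producer() {
        payload = 42;                                   // (1) write the data
        ready.store(true, std::memory_order_release);   // (2) publish; release keeps (1) before (2)
    }

    void consumer() {
        while (!ready.load(std::memory_order_acquire))  // acquire pairs with the release above
            ;                                           // spin until published
        // Without the release/acquire pair (e.g., with plain non-atomic writes and reads),
        // the compiler or the hardware may reorder the operations, and the consumer could
        // observe ready == true while payload is still 0.
        int seen = payload;
        (void)seen;
    }

    int main() {
        std::thread t1(producer), t2(consumer);
        t1.join();
        t2.join();
    }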

Finally, increasing the size of on-chip cache is about staying away from RAM. Main memory continues to be so much slower than the CPU that it makes sense to put the data closer to the processor, and you can't get much closer than being right on the die. On-die cache sizes have soared, and today most major chip vendors will sell you CPUs that have 2MB and more of on-board L2 cache. (Of these three major historical approaches to boosting CPU performance, increasing cache is the only one that will continue in the near term. I'll talk a little more about the importance of cache later on.)

Okay. So what does this mean?

A fundamentally important thing to recognize about this list is that all of these areas are concurrency-agnostic. Speedups in any of these areas will directly lead to speedups in sequential (nonparallel, single-threaded, single-process) applications, as well as applications that do make use of concurrency. That's important, because the vast majority of today's applications are single-threaded, for good reasons that I'll get into further below.

Of course, compilers have had to keep up; sometimes you need to recompile your application, and target a specific minimum level of CPU, in order to benefit from new instructions (e.g., MMX, SSE) and some new CPU features and characteristics. But, by and large, even old applications have always run significantly faster, even without being recompiled to take advantage of all the new instructions and features offered by the latest CPUs.

That world was a nice place to be. Unfortunately, it has already disappeared.

CPU performance growth as we have known it hit a wall two years ago. Most people have only recently started to notice.

You can get similar graphs for other chips, but I'm going to use Intel data here. Figure 1 graphs the history of Intel chip introductions by clock speed and number of transistors. The number of transistors continues to climb, at least for now. Clock speed, however, is a different story.

Figure 1: Intel CPU Introductions (graph updated August 2009; article text original from December 2004)

Around the beginning of 2003, you'll note a disturbing sharp turn in the previous trend toward ever-faster CPU clock speeds. I've added lines to show the limit trends in maximum clock speed; instead of continuing on the previous path, as indicated by the thin dotted line, there is a sharp flattening. It has become harder and harder to exploit higher clock speeds due to not just one but several physical issues, notably heat (too much of it and too hard to dissipate), power consumption (too high), and current leakage problems.

Quick: What's the clock speed on the CPU(s) in your current workstation? Are you running at 10GHz? On Intel chips, we reached 2GHz a long time ago (August 2001), and according to CPU trends before 2003, now in early 2005 we should have the first 10GHz Pentium-family chips. A quick look around shows that, well, actually, we don't. What's more, such chips are not even on the horizon; we have no good idea at all about when we might see them appear.

Well, then, what about 4GHz? We're at 3.4GHz already; surely 4GHz can't be far away? Alas, even 4GHz seems to be remote indeed. In mid-2004, as you probably know, Intel first delayed its planned introduction of a 4GHz chip until 2005, and then in fall 2004 it officially abandoned its 4GHz plans entirely. As of this writing, Intel is planning to ramp up a little further to 3.73GHz in early 2005 (already included in Figure 1 as the upper-right-most dot), but the clock race really is over, at least for now; Intel's and most processor vendors' future lies elsewhere as chip companies aggressively pursue the same new multicore directions.

We'll probably see 4GHz CPUs in our mainstream desktop machines someday, but it won't be in 2005. Sure, Intel has samples of their chips running at even higher speeds in the lab, but only by heroic efforts, such as attaching hideously impractical quantities of cooling equipment. You won't have that kind of cooling hardware in your office any day soon, let alone on your lap while computing on the plane.

"There ain't no such thing as a free lunch." R. A. Heinlein, The Moon Is a Harsh Mistress

Does this mean Moore's Law is over? Interestingly, the answer in general seems to be no. Of course, like all exponential progressions, Moore's Law must end someday, but it does not seem to be in danger for a few more years yet. Despite the wall that chip engineers have hit in juicing up raw clock cycles, transistor counts continue to explode and it seems CPUs will continue to follow Moore's Law-like throughput gains for some years to come.

So a dual-core CPU that combines two 3GHz cores practically offers 6GHz of processing power. Right?

Wrong. Even having two threads running on two physical processors doesn't mean getting two times the performance. Similarly, most multi-threaded applications won't run twice as fast on a dual-core box. They should run faster than on a single-core CPU; the performance gain just isn't linear, that's all.

Why not? First, there is coordination overhead between the cores to ensure cache coherency (a consistent view of cache, and of main memory) and to perform other handshaking. Today, a two- or four-processor machine isn't really two or four times as fast as a single CPU even for multi-threaded applications. The problem remains essentially the same even when the CPUs in question sit on the same die.

Second, unless the two cores are running different processes, or different threads of a single process that are well-written to run independently and almost never wait for each other, they won't be well utilized. (Despite this, I will speculate that today's single-threaded applications as actually used in the field could actually see a performance boost for most users by going to a dual-core chip, not because the extra core is actually doing anything useful, but because it is running the adware and spyware that infest many users' systems and are otherwise slowing down the single CPU that user has today. I leave it up to you to decide whether adding a CPU to run your spyware is the best solution to that problem.)

If you're running a single-threaded application, then the application can only make use of one core. There should be some speedup as the operating system and the application can run on separate cores, but typically the OS isn't going to be maxing out the CPU anyway so one of the cores will be mostly idle. (Again, the spyware can share the OS's core most of the time.)
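One standard way to make "the performance gain just isn't linear" concrete is Amdahl's law, which the article does not invoke by name; the small added sketch below (C++, hypothetical names) is an illustration under that assumption, not part of the original text.

    // Amdahl's law: if a fraction p of a program's work can be parallelized
    // across n cores, the best achievable speedup is 1 / ((1 - p) + p / n).
    double amdahl_speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }
    // amdahl_speedup(0.8, 2)    ~= 1.67x, not 2x, on a dual-core part
    // amdahl_speedup(0.8, 1000) stays below 5x no matter how many cores you add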

The key difference, which is the heart of this article, is that the performance gains are going to be accomplished in fundamentally different ways for at least the next couple of processor generations. And most current applications will no longer benefit from the free ride without significant redesign.

For the near-term future, meaning for the next few years, the performance gains in new chips will be fueled by three main approaches, only one of which is the same as in the past. The near-term future performance growth drivers are:

hyperthreading

multicore

cache

Hyperthreading is about running two or more threads in parallel inside a single CPU. Hyperthreaded CPUs are already available today, and they do allow some instructions to run in parallel. A limiting factor, however, is that although a hyper-threaded CPU has some extra hardware including extra registers, it still has just one cache, one integer math unit, one FPU, and in general just one each of most basic CPU features. Hyperthreading is sometimes cited as offering a 5% to 15% performance boost for reasonably well-written multi-threaded applications, or even as much as 40% under ideal conditions for carefully written multi-threaded applications. That's good, but it's hardly double, and it doesn't help single-threaded applications.

Multicore is about running two or more actual CPUs on one chip. Some chips, including Sparc and PowerPC, have multicore versions available already. The initial Intel and AMD designs, both due in 2005, vary in their level of integration but are functionally similar. AMD's seems to have some initial performance design advantages, such as better integration of support functions on the same die, whereas Intel's initial entry basically just glues together two Xeons on a single die. The performance gains should initially be about the same as having a true dual-CPU system (only the system will be cheaper because the motherboard doesn't have to have two sockets and associated glue chippery), which means something less than double the speed even in the ideal case, and just like today it will boost reasonably well-written multi-threaded applications. Not single-threaded ones.

Finally, on-die cache sizes can be expected to continue to grow, at least in the near term. Of these three areas, only this one will broadly benefit most existing applications. The continuing growth in on-die cache sizes is an incredibly important and highly applicable benefit for many applications, simply because space is speed. Accessing main memory is expensive, and you really don't want to touch RAM if you can help it. On today's systems, a cache miss that goes out to main memory often costs 10 to 50 times as much as getting the information from the cache; this, incidentally, continues to surprise people because we all think of memory as fast, and it is fast compared to disks and networks, but not compared to on-board cache which runs at faster speeds. If an application's working set fits into cache, we're golden, and if it doesn't, we're not. That is why increased cache sizes will save some existing applications and breathe life into them for a few more years without requiring significant redesign: As existing applications manipulate more and more data, and as they are incrementally updated to include more code for new features, performance-sensitive operations need to continue to fit into cache. As the Depression-era old-timers will be quick to remind you, "Cache is king."
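As an added illustration of "space is speed" (not from the original article), the two loops below do identical arithmetic over the same row-major matrix, yet the cache-friendly traversal is typically several times faster, because it walks memory sequentially instead of striding across it.

    #include <cstddef>
    #include <vector>

    // Sum an n-by-n matrix stored in row-major order, two ways.
    // Row-order traversal touches memory sequentially and stays in cache;
    // column-order traversal jumps n*sizeof(double) bytes per step and
    // misses cache far more often, even though the arithmetic is identical.
    double sum_row_major(const std::vector<double>& m, std::size_t n) {
        double s = 0.0;
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                s += m[i * n + j];      // contiguous accesses
        return s;
    }

    double sum_col_major(const std::vector<double>& m, std::size_t n) {
        double s = 0.0;
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t i = 0; i < n; ++i)
                s += m[i * n + j];      // strided, cache-hostile accesses
        return s;
    }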

(Aside: Here's an anecdote to demonstrate "space is speed" that recently hit my compiler team. The compiler uses the same source base for the 32-bit and 64-bit compilers; the code is just compiled as either a 32-bit process or a 64-bit one. The 64-bit compiler gained a great deal of baseline performance by running on a 64-bit CPU, principally because the 64-bit CPU had many more registers to work with and had other code performance features. All well and good. But what about data? Going to 64 bits didn't change the size of most of the data in memory, except that of course pointers in particular were now twice the size they were before. As it happens, our compiler uses pointers much more heavily in its internal data structures than most other kinds of applications ever would. Because pointers were now 8 bytes instead of 4 bytes, a pure data size increase, we saw a significant increase in the 64-bit compiler's working set. That bigger working set caused a performance penalty that almost exactly offset the code execution performance increase we'd gained from going to the faster processor with more registers. As of this writing, the 64-bit compiler runs at the same speed as the 32-bit compiler, even though the source base is the same for both and the 64-bit processor offers better raw processing throughput. Space is speed.)

But cache is it. Hyperthreading and multicore CPUs will have nearly no impact on most current applications.

So what does this change in the hardware mean for the way we write software? By now you've probably noticed the basic answer, so let's consider it and its consequences.

In the 1990s, we learned to grok objects. The revolution in mainstream software development from structured programming to object-oriented programming was the greatest such change in the past 20 years, and arguably in the past 30 years. There have been other changes, including the most recent (and genuinely interesting) naissance of web services, but nothing that most of us have seen during our careers has been as fundamental and as far-reaching a change in the way we write software as the object revolution.

Until now.

Starting today, the performance lunch isn't free any more. Sure, there will continue to be generally applicable performance gains that everyone can pick up, thanks mainly to cache size improvements. But if you want your application to benefit from the continued exponential throughput advances in new processors, it will need to be a well-written concurrent (usually multithreaded) application. And that's easier said than done, because not all problems are inherently parallelizable and because concurrent programming is hard.

I can hear the howls of protest: "Concurrency? That's not news! People are already writing concurrent applications." That's true. Of a small fraction of developers.

Remember that people have been doing object-oriented programming since at least the days of Simula in the late 1960s. But OO didn't become a revolution, and dominant in the mainstream, until the 1990s. Why then? The reason the revolution happened was primarily that our industry was driven by requirements to write larger and larger systems that solved larger and larger problems and exploited the greater and greater CPU and storage resources that were becoming available. OOP's strengths in abstraction and dependency management made it a necessity for achieving large-scale software development that is economical, reliable, and repeatable.

Concurrency is the next major revolution in how we write software

Similarly, we've been doing concurrent programming since those same dark ages, writing coroutines and monitors and similar jazzy stuff. And for the past decade or so we've witnessed incrementally more and more programmers writing concurrent (multi-threaded, multi-process) systems. But an actual revolution marked by a major turning point toward concurrency has been slow to materialize. Today the vast majority of applications are single-threaded, and for good reasons that I'll summarize in the next section.

By the way, on the matter of hype: People have always been quick to announce "the next software development revolution," usually about their own brand-new technology. Don't believe it. New technologies are often genuinely interesting and sometimes beneficial, but the biggest revolutions in the way we write software generally come from technologies that have already been around for some years and have already experienced gradual growth before they transition to explosive growth. This is necessary: You can only base a software development revolution on a technology that's mature enough to build on (including having solid vendor and tool support), and it generally takes any new software technology at least seven years before it's solid enough to be broadly usable without performance cliffs and other gotchas. As a result, true software development revolutions like OO happen around technologies that have already been undergoing refinement for years, often decades. Even in Hollywood, most genuine "overnight successes" have really been performing for many years before their big break.

Concurrency is the next major revolution in how we write software. Different experts still have different opinions on whether it will be bigger than OO, but that kind of conversation is best left to pundits. For technologists, the interesting thing is that concurrency is of the same order as OO both in the (expected) scale of the revolution and in the complexity and learning curve of the technology.

There are two major reasons for which concurrency, especially multithreading, is already used in mainstream software. The first is to logically separate naturally independent control flows; for example, in a database replication server I designed it was natural to put each replication session on its own thread, because each session worked completely independently of any others that might be active (as long as they weren't working on the same database row). The second and less common reason to write concurrent code in the past has been for performance, either to scalably take advantage of multiple physical CPUs or to easily take advantage of latency in other parts of the application; in my database replication server, this factor applied as well and the separate threads were able to scale well on multiple CPUs as our server handled more and more concurrent replication sessions with many other servers.
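A minimal added sketch of that first pattern, one thread per naturally independent control flow, follows; the article names no code, so std::thread and the invented Session type are used here purely as illustration.

    #include <thread>
    #include <vector>

    // Hypothetical session type; stands in for the replication sessions described above.
    struct Session {
        int id = 0;
        void run() { /* do this session's work; no shared state with other sessions */ }
    };

    // One thread per naturally independent control flow: sessions never wait on
    // each other, so this separates them cleanly and also scales across CPUs.
    void serve(std::vector<Session>& sessions) {
        std::vector<std::thread> workers;
        for (Session& s : sessions)
            workers.emplace_back([&s] { s.run(); });
        for (std::thread& t : workers)
            t.join();
    }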

There are, however, real costs to concurrency. Some of the obvious costs are actually relatively unimportant. For example, yes, locks can be expensive to acquire, but when used judiciously and properly you gain much more from the concurrent execution than you lose on the synchronization, if you can find a sensible way to parallelize the operation and minimize or eliminate shared state.
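Here is a small added sketch of that principle, parallelizing a reduction while keeping shared state, and therefore synchronization cost, to a minimum (modern C++ used only for illustration; the names are hypothetical):

    #include <algorithm>
    #include <mutex>
    #include <numeric>
    #include <thread>
    #include <vector>

    // Each thread accumulates into its own local total and takes the lock exactly
    // once, so the synchronization cost is negligible next to the work done.
    double parallel_sum(const std::vector<double>& data, unsigned nthreads) {
        if (nthreads == 0) nthreads = 1;
        double total = 0.0;
        std::mutex m;
        std::vector<std::thread> workers;
        std::size_t chunk = data.size() / nthreads + 1;
        for (unsigned t = 0; t < nthreads; ++t) {
            workers.emplace_back([&, t] {
                std::size_t begin = std::min(data.size(), t * chunk);
                std::size_t end = std::min(data.size(), begin + chunk);
                double local = std::accumulate(data.begin() + begin, data.begin() + end, 0.0);
                std::lock_guard<std::mutex> lock(m);   // one brief critical section per thread
                total += local;
            });
        }
        for (std::thread& w : workers) w.join();
        return total;
    }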

Perhaps the second-greatest cost of concurrency is that not all applications are amenable to parallelization. I'll say more about this later on.

Probably the greatest cost of concurrency is that concurrency really is hard: The programming model, meaning the model in the programmer's head that he needs to reason reliably about his program, is much harder than it is for sequential control flow.

Everybody who learns concurrency thinks they understand it, ends up finding mysterious races they thought weren't possible, and discovers that they didn't actually understand it yet after all. As the developer learns to reason about concurrency, they find that usually those races can be caught by reasonable in-house testing, and they reach a new plateau of knowledge and comfort. What usually doesn't get caught in testing, however, except in shops that understand why and how to do real stress testing, is those latent concurrency bugs that surface only on true multiprocessor systems, where the threads aren't just being switched around on a single processor but where they really do execute truly simultaneously and thus expose new classes of errors. This is the next jolt for people who thought that surely now they know how to write concurrent code: I've come across many teams whose application worked fine even under heavy and extended stress testing, and ran perfectly at many customer sites, until the day that a customer actually had a real multiprocessor machine and then deeply mysterious races and corruptions started to manifest intermittently. In the context of today's CPU landscape, then, redesigning your application to run multithreaded on a multicore machine is a little like learning to swim by jumping into the deep end: going straight to the least forgiving, truly parallel environment that is most likely to expose the things you got wrong. Even when you have a team that can reliably write safe concurrent code, there are other pitfalls; for example, concurrent code that is completely safe but isn't any faster than it was on a single-core machine, typically because the threads aren't independent enough and share a dependency on a single resource which re-serializes the program's execution. This stuff gets pretty subtle.
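For readers who have not yet met one, the following added example (modern C++, not from the article) shows the simplest possible race of the kind described above, together with the lock-based fix:

    #include <iostream>
    #include <mutex>
    #include <thread>

    int counter = 0;            // shared and unprotected: this is the bug
    std::mutex counter_mutex;   // used by the corrected version below

    // "++counter" is a read-modify-write, so concurrent increments can be lost.
    // On a single processor the losing interleavings are rare; on a real
    // multiprocessor they show up constantly.
    void unsafe_worker() {
        for (int i = 0; i < 1000000; ++i)
            ++counter;          // data race: undefined behavior in C++
    }

    // The corrected version: each increment is protected by the mutex.
    void safe_worker() {
        for (int i = 0; i < 1000000; ++i) {
            std::lock_guard<std::mutex> lock(counter_mutex);
            ++counter;
        }
    }

    int main() {
        std::thread a(unsafe_worker), b(unsafe_worker);
        a.join();
        b.join();
        std::cout << counter << '\n';   // usually prints less than 2000000
    }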

The vast majority of programmers today don't grok concurrency, just as the vast majority of programmers 15 years ago didn't yet grok objects

Just as it is a leap for a structured programmer to learn OO (what's an object? what's a virtual function? how should I use inheritance? and beyond the whats and hows, why are the correct design practices actually correct?), it's a leap of about the same magnitude for a sequential programmer to learn concurrency (what's a race? what's a deadlock? how can it come up, and how do I avoid it? what constructs actually serialize the program that I thought was parallel? how is the message queue my friend? and beyond the whats and hows, why are the correct design practices actually correct?).

The vast majority of programmers today don't grok concurrency, just as the vast majority of programmers 15 years ago didn't yet grok objects. But the concurrent programming model is learnable, particularly if we stick to message- and lock-based programming, and once grokked it isn't that much harder than OO and hopefully can become just as natural. Just be ready and allow for the investment in training and time, for you and for your team.

(I deliberately limit the above to message- and lock-based concurrent programming models. There is also lock-free programming, supported most directly at the language level in Java 5 and in at least one popular C++ compiler. But concurrent lock-free programming is known to be very much harder for programmers to understand and reason about than even concurrent lock-based programming. Most of the time, only systems and library writers should have to understand lock-free programming, although virtually everybody should be able to take advantage of the lock-free systems and libraries those people produce. Frankly, even lock-based programming is hazardous.)
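As a rough illustration of the kind of primitive those lock-free facilities expose, here is an atomic counter using std::atomic, the form C++ later standardized; the article itself points to Java 5 and vendor-specific C++ support of its era, so this is a stand-in, not the mechanism it describes.

    #include <atomic>
    #include <thread>
    #include <vector>

    std::atomic<long> hits{0};   // many threads can bump this without a mutex

    void worker() {
        for (int i = 0; i < 100000; ++i)
            hits.fetch_add(1, std::memory_order_relaxed);  // single atomic operation, no lock
    }

    int main() {
        std::vector<std::thread> pool;
        for (int i = 0; i < 4; ++i) pool.emplace_back(worker);
        for (std::thread& t : pool) t.join();
        // hits == 400000: correct without locks, but general lock-free data
        // structures are far harder to get right than this one counter.
    }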

Okay, back to what it means for us.

1. The clear primary consequence we've already covered is that applications will increasingly need to be concurrent if they want to fully exploit CPU throughput gains that have now started becoming available and will continue to materialize over the next several years. For example, Intel is talking about someday producing 100-core chips; a single-threaded application can exploit at most 1/100 of such a chip's potential throughput. "Oh, performance doesn't matter so much, computers just keep getting faster" has always been a naïve statement to be viewed with suspicion, and for the near future it will almost always be simply wrong.

Applications will increasingly need to be concurrent if they want to fully exploit continuing exponential CPU throughput gains

Efficiency and performance optimization will get more, not less, important

Now, not all applications (or, more precisely, important operations of an application) are amenable to parallelization. True, some problems, such as compilation, are almost ideally parallelizable. But others aren't; the usual counterexample here is that just because it takes one woman nine months to produce a baby doesn't imply that nine women could produce one baby in one month. You've probably come across that analogy before. But did you notice the problem with leaving the analogy at that? Here's the trick question to ask the next person who uses it on you: Can you conclude from this that the Human Baby Problem is inherently not amenable to parallelization? Usually people relating this analogy err in quickly concluding that it demonstrates an inherently nonparallel problem, but that's actually not necessarily correct at all. It is indeed an inherently nonparallel problem if the goal is to produce one child. It is actually an ideally parallelizable problem if the goal is to produce many children! Knowing the real goals can make all the difference. This basic goal-oriented principle is something to keep in mind when considering whether and how to parallelize your software.

2. Perhaps a less obvious consequence is that applications are likely to become increasingly CPU-bound. Of course, not every application operation will be CPU-bound, and even those that will be affected won't become CPU-bound overnight if they aren't already, but we seem to have reached the end of the "applications are increasingly I/O-bound or network-bound or database-bound" trend, because performance in those areas is still improving rapidly (gigabit WiFi, anyone?) while traditional CPU performance-enhancing techniques have maxed out. Consider: We're stopping in the 3GHz range for now. Therefore single-threaded programs are likely not to get much faster any more for now except for benefits from further cache size growth (which is the main good news). Other gains are likely to be incremental and much smaller than we've been used to seeing in the past, for example as chip designers find new ways to keep pipelines full and avoid stalls, which are areas where the low-hanging fruit has already been harvested. The demand for new application features is unlikely to abate, and even more so the demand to handle vastly growing quantities of application data is unlikely to stop accelerating. As we continue to demand that programs do more, they will increasingly often find that they run out of CPU to do it unless they can code for concurrency.

There are two ways to deal with this sea change toward concurrency. One is to redesign your applications for concurrency, as above. The other is to be frugal, by writing code that is more efficient and less wasteful. This leads to the third interesting consequence:

3. Efficiency and performance optimization will get more, not less, important. Those languages that already lend themselves to heavy optimization will find new life; those that don't will need to find ways to compete and become more efficient and optimizable. Expect long-term increased demand for performance-oriented languages and systems.

4. Finally, programming languages and systems will increasingly be forced to deal well with concurrency. The Java language has included support for concurrency since its beginning, although mistakes were made that later had to be corrected over several releases in order to do concurrent programming more correctly and efficiently. The C++ language has long been used to write heavy-duty multithreaded systems well, but it has no standardized support for concurrency at all (the ISO C++ standard doesn't even mention threads, and does so intentionally), and so typically the concurrency is of necessity accomplished by using nonportable platform-specific concurrency features and libraries. (It's also often incomplete; for example, static variables must be initialized only once, which typically requires that the compiler wrap them with a lock, but many C++ implementations do not generate the lock.) Finally, there are a few concurrency standards, including pthreads and OpenMP, and some of these support implicit as well as explicit parallelization. Having the compiler look at your single-threaded program and automatically figure out how to parallelize it implicitly is fine and dandy, but those automatic transformation tools are limited and don't yield nearly the gains of explicit concurrency control that you code yourself. The mainstream state of the art revolves around lock-based programming, which is subtle and hazardous. We desperately need a higher-level programming model for concurrency than languages offer today; I'll have more to say about that soon.
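The static-initialization gap mentioned above is easy to show with a small added sketch (the function and path are hypothetical). The first call lazily initializes a local static; compilers of the era described here were not required to guard that initialization against two simultaneous first calls, whereas C++11 and later guarantee it is thread-safe.

    #include <string>

    // Two threads calling config_path() for the first time race on the lazy
    // initialization unless the compiler wraps it with a guard. Pre-C++11
    // compilers often did not emit that guard; C++11 and later require it.
    const std::string& config_path() {
        static std::string path = "/etc/app.conf";   // hypothetical expensive lookup
        return path;
    }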

If you haven't done so already, now is the time to take a hard look at the design of your application, determine what operations are CPU-sensitive now or are likely to become so soon, and identify how those places could benefit from concurrency. Now is also the time for you and your team to grok concurrent programming's requirements, pitfalls, styles, and idioms.

A few rare classes of applications are naturally parallelizable, but most aren't. Even when you know exactly where you're CPU-bound, you may well find it difficult to figure out how to parallelize those operations; all the more reason to start thinking about it now. Implicitly parallelizing compilers can help a little, but don't expect much; they can't do nearly as good a job of parallelizing your sequential program as you could do by turning it into an explicitly parallel and threaded version.
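By contrast, explicit parallelization, for example with OpenMP (one of the standards named above), lets the programmer assert the independence that an auto-parallelizing compiler usually cannot prove on its own. A minimal added sketch, assuming OpenMP is enabled at compile time (e.g., -fopenmp):

    #include <vector>

    // The pragma tells the compiler the iterations are independent, which is
    // exactly the information implicit parallelization has to guess at.
    void scale(std::vector<double>& v, double factor) {
        #pragma omp parallel for
        for (long i = 0; i < static_cast<long>(v.size()); ++i)
            v[i] *= factor;     // each iteration touches only v[i]; no shared state
    }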

Thanks to continued cache growth and probably a few more incremental straight-line control flow optimizations, the free lunch will continue a little while longer; but starting today the buffet will only be serving that one entrée and that one dessert. The filet mignon of throughput gains is still on the menu, but now it costs extra: extra development effort, extra code complexity, and extra testing effort. The good news is that for many classes of applications the extra effort will be worthwhile, because concurrency will let them fully exploit the continuing exponential gains in processor throughput.
