13 08

This kind of automated coding is called “machine programming.” One of its most interesting capabilities is “code semantic similarity,” which attempts to autonomously determine whether two code snippets show similar characteristics or achieve similar goals.
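To make the idea concrete, here is a minimal sketch of what "achieving similar goals" can mean. It is a toy, not how production systems work: those rely on learned models, whereas this example simply runs two functions on the same inputs and reports how often their outputs agree. The function names, the sample inputs, and the `behavioral_similarity` helper are all assumptions made up for illustration.

```python
from typing import Callable, Iterable, List


def sum_squares_loop(values: List[int]) -> int:
    # Explicit loop: accumulate the square of each value.
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_builtin(values: List[int]) -> int:
    # Different syntax, same goal: a generator expression fed to sum().
    return sum(v ** 2 for v in values)


def sum_plain(values: List[int]) -> int:
    # Looks similar on the surface, but computes something else.
    return sum(values)


def behavioral_similarity(
    f: Callable[[List[int]], int],
    g: Callable[[List[int]], int],
    inputs: Iterable[List[int]],
) -> float:
    # Hypothetical helper: the fraction of inputs on which f and g agree.
    # This is a crude stand-in for semantic similarity, nothing more.
    cases = list(inputs)
    matches = sum(1 for case in cases if f(case) == g(case))
    return matches / len(cases)


# Assumed sample inputs, chosen only for this illustration.
SAMPLE_INPUTS = [[], [1], [1, 2, 3], [-2, 5], [0, 0, 7]]

same_goal = behavioral_similarity(sum_squares_loop, sum_squares_builtin, SAMPLE_INPUTS)
different_goal = behavioral_similarity(sum_squares_loop, sum_plain, SAMPLE_INPUTS)
print(same_goal, different_goal)
```

The two sum-of-squares functions are written differently but agree on every input, while the plain sum agrees only on the trivial cases. A real system has to reach that kind of judgment without being able to run the code on every possible input, which is where machine learning comes in.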

Code semantic similarity has only recently become achievable, thanks to advances in compute, new machine learning algorithms, and access to “big code data” such as IBM/MIT’s new Project CodeNet, which includes approximately 14 million code samples.

By harnessing code semantic similarity, the industry can develop automated systems that help CIOs keep developer teams productive despite growing software and hardware complexity, while also addressing the software developer talent shortage and combating burnout.

Code semantic similarity could also be used in tools that translate between programming languages (i.e., transpilers).

GitHub’s Copilot, which I mentioned earlier, is one example: it is designed to learn the intent of a piece of software and then recommend improved (or more complete) versions to help the developer.

When fully realized, such code recommendation systems have the potential to raise the software quality and productivity of both novice and expert developers by providing them with improved alternatives.

Semantic similarity systems can also work in tandem with developers to autonomously detect errors in code.
