July 6, 2017
Conceptual Correlation - Source Code + How To Build Your Own
Although it's only rough and ready, I've published the source code for my Conceptual Correlation calculator so you can get a feel for how it works and how you might implement your own in whichever language you're interested in. It's actually only about 100 lines of code (not including tests), and if I put my brain in gear, it could well be significantly less. It's a pretty simple process:
1. Parse the code (or the IL code, in this case) using a parser, compiler, decompiler - whatever will get you the names used in the code
2. Tokenize those code names into individual words (e.g., thisMethodName becomes "this" "method" "name")
3. Tokenize the contents of a requirements text file
4. Filter stop words (basically, noise - "the", "at", "we", "I" etc) from these sets of words. You can find freely available lists of stop words online for many languages
5. Lemmatize the word sets - meaning to boil down different inflections of the same word ("report", "reports", "reporting") to a single dictionary root
6. Optionally - just for jolly - count the occurrences of each word
7. Calculate what % of the set of code words are also contained in the requirements words
8. Output the results in a usable format (e.g., console)
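If you'd like to see the shape of it before digging into the real source, here's a minimal sketch of steps 2 to 8 in Python. It is not the published implementation. It assumes step 1 has already handed you a list of code names, and every name in it (STOP_WORDS, split_identifier, crude_lemma, normalise, conceptual_correlation) is made up for illustration. The stop word list is a tiny sample, and crude_lemma is a naive suffix-stripper standing in for a proper dictionary-based lemmatizer.

```python
import re
from collections import Counter

# Tiny illustrative sample; a real stop word list would be far longer.
STOP_WORDS = {"the", "at", "we", "i", "a", "an", "of", "to", "and",
              "is", "in", "for", "this"}


def split_identifier(name):
    """Step 2: split camelCase, PascalCase or snake_case names into lowercase words."""
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", name)
    return [w.lower() for w in words]


def crude_lemma(word):
    """Step 5 (stand-in): strip a couple of common suffixes.

    This is NOT real lemmatization; swap in a dictionary-based
    lemmatizer for anything beyond a demo.
    """
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def normalise(words):
    """Steps 4 to 6: drop stop words, reduce to roots, count occurrences."""
    return Counter(crude_lemma(w) for w in words if w not in STOP_WORDS)


def conceptual_correlation(code_names, requirements_text):
    """Step 7: % of distinct code words that also appear in the requirements."""
    code_words = normalise(
        w for name in code_names for w in split_identifier(name)
    )
    # Step 3: tokenize the requirements text into lowercase words.
    req_words = normalise(re.findall(r"[a-z]+", requirements_text.lower()))
    if not code_words:
        return 0.0, code_words
    shared = set(code_words) & set(req_words)
    return 100.0 * len(shared) / len(code_words), code_words


if __name__ == "__main__":
    names = ["ReportGenerator", "generateReports", "customerAccount", "fooHelper"]
    requirements = "We generate a report for each customer at the end of the month."
    percent, counts = conceptual_correlation(names, requirements)
    # Step 8: output the results to the console.
    print(f"Conceptual correlation: {percent:.1f}%")
    for word, count in counts.most_common():
        print(f"  {word}: {count}")
```

Comparing sets of distinct words, rather than raw counts, keeps one heavily repeated name from skewing the percentage; the counts are only there "for jolly", as in step 6.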
No doubt someone will show us how it can be done in a single line of F#... ;)
Posted on July 6, 2017