I have been contemplating going back to Soho to get my coffee supplies replenished, but this rare find in Sainsbury's saved me the trek to central London. The taste is not as sour as I would like and it does not have the heavy taste I have become accustomed to. Anyway, I would give it 3/5.
18.10.10
A good coffee from back home...Zambia Terranova
17.10.10
A delightfully parallel problem
I am currently recruiting and part of this process involves the analysis of lots of CVs. This can be quite time consuming, so I have decided that I will expedite the process by developing a small text mining application that analyzes the resumes and produces a signature for each document. The most common words in the CV will then be read out by the computer, and if I like the sound of the digest I will shortlist the candidate.
The essential parts of text mining are tokenizing the document and filtering out noise words. Thanks to PLINQ, I can do the following:
- private static IEnumerable<IGrouping<string, string>> CalculateWordFrequency(string[] content)
- {
- // AsParallel() comes first so that PLINQ runs the projection, grouping and filtering in parallel.
- return content.AsParallel()
- .Select(word => word.ToLowerInvariant())
- .GroupBy(word => word)
- // Keep only the terms that occur more than twice.
- .Where(group => group.Count() > 2);
- }
The document tokenizer approach is fairly naive and could do with more work.
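The tokenizing and noise-word filtering steps are not shown in this post. A minimal sketch of what they might look like follows. The names `ResumeText`, `Tokenize`, `FilterNoise` and the tiny `NoiseWords` list are illustrative assumptions rather than the code used in the application, and the regular-expression split is every bit as naive as the approach described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: a stand-in for the tokenizer and filter, not the original code.
public static class ResumeText
{
    // Assumed, deliberately short noise-word list; a real one would be much longer.
    private static readonly HashSet<string> NoiseWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "a", "an", "and", "the", "of", "in", "to", "with"
        };

    // Split on runs of characters that are not letters, digits, '#' or '+',
    // so that terms such as "C#" and "C++" survive as single tokens.
    public static string[] Tokenize(string text)
    {
        return Regex.Split(text, @"[^\p{L}\p{N}#+]+")
                    .Where(token => token.Length > 0)
                    .ToArray();
    }

    // Drop any token that appears in the noise-word list (case-insensitive).
    public static string[] FilterNoise(IEnumerable<string> tokens)
    {
        return tokens.Where(token => !NoiseWords.Contains(token)).ToArray();
    }
}

public static class Program
{
    public static void Main()
    {
        var tokens = ResumeText.FilterNoise(
            ResumeText.Tokenize("Designed and built the C# pricing engine, and the C++ feed."));

        // prints: Designed built C# pricing engine C++ feed
        Console.WriteLine(string.Join(" ", tokens));
    }
}
```

The resulting array could then be passed to a word-frequency method such as CalculateWordFrequency.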
After we have grouped the words in each resume, we use the Microsoft Speech API, found in the System.Speech.Synthesis namespace, to recite the most common terms:
- public static void ReciteResume()
- {
- using (var speechSynthesizer = new SpeechSynthesizer())
- {
- // Select the audio device once, before speaking, rather than on every iteration.
- speechSynthesizer.SetOutputToDefaultAudioDevice();
- foreach (var item in GetGroupedTermsInResume(FilterContent(TokenizeContent(ReadDocument()))))
- {
- speechSynthesizer.Speak(item.Key);
- }
- }
- }
I have already hired a dozen plus developers using the traditional filtering approach, so it would be interesting to see how the results of the automated CV selection process compare.
An alternative method of representing the CV digest is to generate a logarithmic plot, or “signature file”, as shown below. I wonder what signature represents an ideal candidate. I would probably need to analyze large amounts of data to arrive at an empirically valid conclusion.
Incidentally this candidate was not hired as their CV contained a lot of buzzwords and they could not explain how they had used the technologies.
And this is what it sounds like [audio file].
9.10.10
You gotta love Design By Contract
- [TestMethod]
- [ExpectedException(typeof(TradingServiceException))]
- public void ShouldThrowExceptionForInvalidTrade()
- {
- var mockTrade = new Mock<AbstractTrade>(MockBehavior.Strict);
- mockTrade.SetupAllProperties();
- mockTrade.Setup(trade => trade.IsValid()).Returns(false);
- var tradeManager = new TradeManager(mockTrade.Object);
- tradeManager.Execute();
- mockTrade.Verify();
- }
And when the test fails, we get the following message:
TradeGeneratorTests.ShouldThrowExceptionForInvalidTrade : Failed
Test method Zainco.Commodities.Unit.Tests.TradeGeneratorTests.ShouldThrowExceptionForInvalidTrade threw exception:
Zainco.Commodities.Exceptions.TradingServiceException: Precondition failed: Trade.IsValid() == true Trade execution failed
at System.Diagnostics.Contracts.__ContractsRuntime.Requires<TException>(Boolean condition, String message, String conditionText) in :line 0
at Zainco.Commodities.TradingService.TradeManager.Execute() in TradeManager.cs: line 28
at Zainco.Commodities.Unit.Tests.TradeGeneratorTests.ShouldThrowExceptionForInvalidTrade() in TradeGeneratorTests.cs: line 36
Sweet!
CodeContracts break encapsulation and ReSharper 5.0 is blissfully unaware of them
It seems I have to make my helper methods public if I want to use them within a code contract, but I avoid this by defining a property with a private setter, which gives the Contract.Requires<TException>(…) implementation below:
- public AbstractTrade Trade
- {
- private set { _trade = value; }
- get { return _trade; }
- }
- public void Execute()
- {
- Contract.Requires<TradingServiceException>(Trade.IsValid(),
- "Trade execution failed");
- if (TradeExecutedEvent != null)
- {
- TradeExecutedEvent(this, new TradeEventArgs(Trade));
- }
- }
C:\Sandbox\Pricing\CommodityServer\TradingService\TradeManager.cs(24,13): error CC1038: Member 'Zainco.Commodities.TradingService.TradeManager.get_Trade' has less visibility than the enclosing method 'Zainco.Commodities.TradingService.TradeManager.Execute'.
C:\Sandbox\Pricing\CommodityServer\TradingService\TradeManager.cs(24,13): warning CC1036: Detected call to method 'Zainco.Commodities.Interfaces.AbstractTrade.IsValid' without [Pure] in contracts of method 'Zainco.Commodities.TradingService.TradeManager.Execute'.
elapsed time: 294.0169ms
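For reference, here is a hedged sketch of a shape that should address both diagnostics: a getter that is as visible as Execute (CC1038) and a query method marked [Pure] (CC1036). The type bodies are stand-ins, since the real AbstractTrade and TradingServiceException definitions are not shown in this post, and InvalidTrade is purely hypothetical. Contract.Requires<TException> only works when the Code Contracts rewriter is enabled, so the sketch uses a plain guard in its place to stay runnable on its own.

```csharp
using System;
using System.Diagnostics.Contracts;

// Stand-ins for the types named in the post; their real definitions are not shown there.
public class TradingServiceException : Exception
{
    public TradingServiceException(string message) : base(message) { }
}

public abstract class AbstractTrade
{
    // [Pure] declares that the method has no visible side effects,
    // which is what warning CC1036 asks for.
    [Pure]
    public abstract bool IsValid();
}

public class TradeManager
{
    public TradeManager(AbstractTrade trade)
    {
        Trade = trade;
    }

    // Public getter, so it is as visible as Execute (avoiding CC1038); private setter.
    public AbstractTrade Trade { get; private set; }

    public void Execute()
    {
        // With the rewriter enabled this guard would be written as
        //   Contract.Requires<TradingServiceException>(Trade.IsValid(), "Trade execution failed");
        // A plain check is used here (an assumption of this sketch) so it runs without the rewriter.
        if (!Trade.IsValid())
        {
            throw new TradingServiceException("Trade execution failed");
        }
    }
}

// Hypothetical concrete trade used only to exercise the guard.
public class InvalidTrade : AbstractTrade
{
    public override bool IsValid() { return false; }
}

public static class Program
{
    public static void Main()
    {
        try
        {
            new TradeManager(new InvalidTrade()).Execute();
        }
        catch (TradingServiceException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```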
8.10.10
Are all circular dependencies created equal?
I am using the separated interface pattern to implement a commodity trading engine for a bourse in the emerging markets. My unit test package mocks one of the interfaces and consequently has a dependency on the interfaces package. Likewise, the service package depends on the interfaces package.
Superficially I appear to have a circular dependency, and an eager ReSharper 5.0 complains about this with a fairly descriptive error message: “Failed to reference module. Probably reference will produce circular dependencies between projects.” The result is that IntelliSense breaks!
What to do?
Are all circular dependencies created equal? Does it matter that the offending dependency in this case is an abstraction rather than a concrete type?
On further examination, the only real dependency is between the test package and the trading service. The other dependencies essentially enforce the contracts between the interface package and the packages that must either implement the behaviours it defines or use them.
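The dependency directions can be sketched in a single file, with one namespace standing in for each project. All names here (`Sketch.Interfaces`, `ITrade`, `AlwaysInvalidTrade` and so on) are hypothetical and the real packages will differ. The point is only that both the service and the tests reference the interfaces, and nothing in the interfaces references them back, so there is no cycle.

```csharp
using System;

namespace Sketch.Interfaces            // stands in for the interfaces package
{
    // The abstraction that both other packages depend on.
    public interface ITrade
    {
        bool IsValid();
    }
}

namespace Sketch.TradingService        // depends only on Sketch.Interfaces
{
    using Sketch.Interfaces;

    public class TradeManager
    {
        private readonly ITrade _trade;

        public TradeManager(ITrade trade)
        {
            _trade = trade;
        }

        public bool CanExecute()
        {
            return _trade.IsValid();
        }
    }
}

namespace Sketch.Tests                 // depends on Interfaces and TradingService
{
    using Sketch.Interfaces;
    using Sketch.TradingService;

    // Hand-rolled test double standing in for a mocking framework.
    public class AlwaysInvalidTrade : ITrade
    {
        public bool IsValid() { return false; }
    }

    public static class Program
    {
        public static void Main()
        {
            var manager = new TradeManager(new AlwaysInvalidTrade());
            Console.WriteLine(manager.CanExecute());   // prints False
        }
    }
}
```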
Tools hmmm….
5.8.10
29.7.10
Refreshing Exceptions…
Ever faced cryptic exceptions that leave you red-eyed from staring at the debug window and inspecting automatic variables? Well, if you have spent most of your adult life making money from software development, the probability of this happening is fairly high. So when you encounter an exception message like the one below, you know that the software/API has been developed by developers who care:
WorkOrderTests.ShouldSaveWorkOrderDocument : Failed
Test method WorMaSysUnitTests.WorkOrderTests.ShouldSaveWorkOrderDocument threw exception:
System.InvalidOperationException: The maximum number of requests (30) allowed for this session has been reached.
Raven limits the number of remote calls that a session is allowed to make as an early warning system. Sessions are expected to be short lived, and
Raven provides facilities like Load(string[] keys) to load multiple documents at once and batch saves.
You can increase the limit by setting DocumentConvention.MaxNumberOfRequestsPerSession or DocumentSession.MaxNumberOfRequestsPerSession, but it is
advisable that you'll look into reducing the number of remote calls first, since that will speed up your application signficantly and result in a
more responsive application.
at Raven.Client.Document.InMemoryDocumentSessionOperations.IncrementRequestCount()
at Raven.Client.Document.DocumentSession.SaveChanges()
at WorMaSysUnitTests.WorkOrderTests.ShouldSaveWorkOrderDocument() in WorkOrderTests.cs: line 58
Playing with the Raven
Zainco is currently evaluating NoSQL approaches for a groundbreaking application we are developing for a security client. At present Raven is looking very promising.
5.6.10
18.5.10
30 sprints later…
I have spent the best part of the last year on a large project with many cross-functional teams, building a .NET application that integrates with a specialised vendor ERP system. As with any experience, you learn a great deal about yourself and others.
My greatest lesson is that there is a lot of crap software out there making lots of money and that good software (read this as open source) rarely, if ever, makes money.
With the crap software come the zealots and snake oil salesmen selling their bogus cures and panaceas.
If ever I had to express this in mathematical terms, I would say that:
The rate of return on investment on a software asset is inversely proportional to the quality of the code deployed.
This appears to tie in with the empirical data, but as with any such observation, outliers or exceptions will exist.
I suppose our challenge as software craftsmen is to ensure that quality and ROI are balanced.