Dr. Volkan Tunalı's Personal Blog

Computer, Technology, Science, Art

Archive for the ‘Software Development’ Category

NoSQL Revolution (and Evolution)

Revolutions are endemic to tech culture. A new group comes along and wonders why the last generation built something so complex, and they set out to tear down the old institutions. After a bit, they begin to realize why all of the old institutions were so complex, and they start implementing the features once again.

We’re seeing this in the NoSQL world, as some of the projects start adding back things that look like transactions, schemas, and standards. This is the nature of progress. We tear things down only to build them back again. NoSQL is finished with the first phase of the revolution and now it’s time for the second one.

I’ve read an interesting article about NoSQL databases at InfoWorld. The article explains how the NoSQL revolution emerged and what the current situation is from the perspective of solution development. I think the article touches on important issues where NoSQL databases have inherent weaknesses alongside their advantages, and every solution architect or developer should be aware of those issues.

To read the full article, please follow this link: 7 hard truths about the NoSQL revolution.

Written by Volkan TUNALI

July 16th, 2012 at 4:34 pm

Cryptography Classes for .NET Compact Framework

You may know that System.Security.Cryptography for the Compact Framework lacks many of the cryptography algorithms available in the desktop .NET Framework (2005 and later). In one project we needed SHA512 hashing on Windows CE, and we found the /cfAes library, which provides almost all of the crypto functionality of the .NET Framework. We are grateful to the author for sharing the class library.
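As a reminder of what was needed here: SHA512 is a one-way hash function that produces a fixed 512-bit digest, not a reversible cipher. The short sketch below shows that behaviour using Python's standard hashlib module. It is only an illustration and does not show the /cfAes API; the helper name sha512_hex is an assumption made for this example.

```python
import hashlib


def sha512_hex(data: bytes) -> str:
    """Return the SHA-512 digest of data as a hexadecimal string."""
    return hashlib.sha512(data).hexdigest()


if __name__ == "__main__":
    digest = sha512_hex(b"abc")
    # A SHA-512 digest is always 64 bytes, i.e. 128 hex characters,
    # no matter how long the input is.
    print(len(digest), digest[:16] + "...")
```

The same input always gives the same digest, and there is no key and no decryption step, which is why it is more accurate to speak of hashing than of encryption.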

The following table compares the versions of the .NET Framework and related libraries with respect to their support for different cryptography algorithms (X means supported, 0 means partially supported).

Crypto Algorithm .NET 2003 OpenNETCF 1.2 /cfAES WSE 2.0 CF 2.0 .NET 2005
MD5 X X X X X
SHA1 X X X X X
SHA256 X X X
SHA384 X X X
SHA512 X X X
MACTripleDES X 0 X X
HMACSHA1 X X X X X
PasswordDeriveBytes X 0 X X
RC2 X X X X X
DES X X X X X
TripleDES X X X X X X
Rijndael X X X X
RSA X X X X X X
DSA X X X X X
RIPEMD160 X X
HMACMD5 X X
HMACSHA256 X X
HMACSHA384 X X
HMACSHA512 X X
HMACRIPEMD160 X X
Rfc2898DeriveBytes X X
ProtectedData X X
ProtectedMemory X X
PSHA1 X X
AESKeyExchangeFormatter X X
TripleDesKeyExchangeFormatter X X
SecureString X X



You can download the source code of the library here.

Written by Volkan TUNALI

November 10th, 2010 at 11:56 am

Storage Cost over Years

Every time I buy a new PC, either desktop or notebook, its hard disk capacity is larger than that of the previous one even though the total price of the PC is about the same. The same may apply to other components of the PC, such as main memory capacity and CPU power, but the growth in hard disk capacity is especially striking.

Matthew Komorowski has collected hard drive capacity/price data and created the graph below:

Storage cost over years
Source: http://www.mkomo.com/cost-per-gigabyte

Komorowski has also drawn a conclusion about the capacity/cost trend:

Over the last 30 years, space per unit cost has doubled roughly every 14 months (increasing by an order of magnitude every 48 months)

He also gives a formula for the cost:

cost = 10^(-0.2502 × (year - 1980) + 6.304)
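The formula and the quoted conclusion can be checked against each other with a few lines of Python. This is a minimal sketch that reads the formula as 10 raised to the power (-0.2502 × (year - 1980) + 6.304) and assumes, as the source URL suggests, that the result is a cost in US dollars per gigabyte. The function and constant names are my own.

```python
import math

SLOPE = -0.2502      # change in log10(cost) per year, from the formula
INTERCEPT = 6.304    # log10(cost) in the reference year
BASE_YEAR = 1980


def cost_per_gb(year: float) -> float:
    """Estimated cost for the given year (assumed unit: USD per gigabyte)."""
    return 10 ** (SLOPE * (year - BASE_YEAR) + INTERCEPT)


def months_to_multiply_capacity(factor: float) -> float:
    """Months until space per unit cost grows by the given factor."""
    years = math.log10(factor) / -SLOPE
    return years * 12


if __name__ == "__main__":
    for year in (1980, 1990, 2000, 2010):
        print(year, f"{cost_per_gb(year):,.4f}")
    print("doubling time (months):", round(months_to_multiply_capacity(2), 1))
    print("tenfold time (months):", round(months_to_multiply_capacity(10), 1))
```

Under this reading the doubling time comes out at a little over 14 months and the tenfold time at about 48 months, which agrees with the statement quoted above.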

Below are two pictures from computer magazines of the 1980s. Incredible!

Written by Volkan TUNALI

September 18th, 2010 at 12:24 pm

Short Movie: Java 4-ever :)

As a software development professional, I am never a fanatic of one platform or one tool. The choice always depends on many factors and constraints. The most important thing, I think, is not the tools you use to solve the problem of the customer. Customers usually do not know about them at all. Customers usually expect good, effective and timely solutions.

I’ve found a short film about Java vs. MS .NET. It’s very funny. :)

If you can’t see the video you can visit http://www.youtube.com/watch?v=Oo-cIGVaOYE

Written by Volkan TUNALI

July 27th, 2010 at 4:56 pm

Turkish Deasciification

Deasciification is the process of converting text written with only ASCII letters into its correct form by restoring the corresponding letters of the Turkish alphabet (or of any language whose alphabet contains non-ASCII letters). For example, the text “Cok yogun bir calisma ve emegin urunu” still conveys its meaning, because a human reader is able to resolve the ambiguities (if any) and understand it. The text, however, should be written as “Çok yoğun bir çalışma ve emeğin ürünü” in Turkish. Producing this form automatically is what a deasciifier is supposed to do.
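To make the idea concrete, here is a toy sketch in Python. It is not the method used by any of the tools reviewed in this post. It simply tries every combination of ASCII letter and Turkish counterpart for a word and keeps the result only when exactly one combination appears in a word list. The word list, the function names, and the restriction to lowercase input are all assumptions made for the example.

```python
from itertools import product

# ASCII letters that may stand in for a Turkish letter (lowercase only).
ALTERNATIVES = {"c": "cç", "g": "gğ", "i": "iı", "o": "oö", "s": "sş", "u": "uü"}

# Hypothetical toy lexicon; a real system would need a full dictionary
# or a statistical model instead.
LEXICON = {"çok", "yoğun", "bir", "çalışma", "ve", "emeğin", "ürünü"}


def candidates(word):
    """Return every spelling obtained by optionally restoring Turkish letters."""
    options = [ALTERNATIVES.get(ch, ch) for ch in word]
    return ["".join(combo) for combo in product(*options)]


def deasciify_word(word, lexicon=LEXICON):
    """Return the unique lexicon match, or the word unchanged if none or several."""
    matches = [c for c in candidates(word) if c in lexicon]
    return matches[0] if len(matches) == 1 else word


def deasciify(text, lexicon=LEXICON):
    """Deasciify a whitespace-separated, lowercase text word by word."""
    return " ".join(deasciify_word(w, lexicon) for w in text.split())


if __name__ == "__main__":
    print(deasciify("cok yogun bir calisma ve emegin urunu"))
```

The sketch also shows where the difficulty lies: when several candidates are valid words (for example “acı” and “açı” for the input “aci”), a word list alone cannot decide, and the word is left unchanged. Resolving such cases requires looking at the surrounding context.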

Well, why do we need deasciification? We may not have Turkish letters on the keyboard (or the OS we are using may lack a Turkish keyboard layout), yet we need to end up with text in correct Turkish form. It is also possible that we are accustomed to typing only with ASCII letters for some reason.

In addition, we may need to analyze a large collection of Turkish documents, and this collection can be contaminated with text written in ASCII, which will degrade the performance of our analysis. Then, the only option is to use deasciification. This is the most important reason for me, as I often perform text mining on Turkish document collections and I always need deasciification.

In this post, I’ll briefly review a few deasciification tools written in several languages.

The first deasciifier is the one that is part of the Zemberek project. Written completely in Java, Zemberek is an open-source, general-purpose Natural Language Processing library and toolset designed for Turkic languages, especially Turkish. A web-based demo of Zemberek is available at http://zemberek-web.appspot.com/. I usually use the deasciifier of Zemberek in my text mining research when I work with Turkish text datasets.

The next deasciifier was developed by Gökhan Tür at Sabancı University. More information and a demo are available at http://www.hlst.sabanciuniv.edu/TL/deascii.html. This system is currently not open-source and is not available for download.

Another deasciifier is from Deniz Yüret at Koç University. It was actually developed for Emacs, for real-time correction of words written in ASCII form. More information and a download are available at http://denizyuret.blogspot.com/2006/11/emacs-turkish-mode.html.

Yüret’s deasciifier has been ported to JavaScript by Mustafa Emre Acer, and is available at http://turkce-karakter.appspot.com/.

The last deasciifier, recently published by Emre Sevinç, is a port of Yüret’s work to Python. More information and a download are available at http://github.com/emres/turkish-deasciifier.

None of these deasciifiers is perfect, but they all perform pretty well in most situations. I’m sure we’ll see much improved deasciifiers with the advances in NLP studies for Turkish.

Written by Volkan TUNALI

July 23rd, 2010 at 7:52 pm