A Tale of Two Services – Links

Here are reference links from my presentation, “A Tale of Two Services”

NoSQL-related:

Windows Azure Table Service
Amazon Web Services SimpleDB
Oracle Berkeley DB
Redis
MemcacheDB
Hibari
Project Voldemort
Riak
GT.M
Apache CouchDB
MongoDB
RavenDB
Google Bigtable
Apache Cassandra
Hypertable
Microsoft Dryad
Neo4j
“NoSQL Distilled” by Sadalage & Fowler

MongoDB – related:

mongodb.org
10gen.com
mongovue.com
MSDN March 2010
Hanselman’s Blog Post
Nielsen’s Blog Post
Mahugh’s Blog Post

Neo4j – related:

Neo4j.org
Using A Graph Database to Power the “Web of Things”
Cypher “Cheat Sheet”
Neo4j in a .NET World – Tatham Oddie

ASP.Net Web API – related:

For Visual Studio 2010: http://www.asp.net/web-api
Getting-started lessons: http://www.asp.net/web-api/videos/getting-started

Posted in Uncategorized | Comments Off on A Tale of Two Services – Links

The Most Common Problem(s) – Mark’s Rant

I can’t print!

I wish I had a dollar for every time I had to help a person with a printing problem.

I can’t find my file.

“I know I saved it somewhere, but I don’t know where and I can’t remember what the file was named. I think I was using Word.”

How do I do this?

“I don’t have time to read the instructions, read a book, or take a course. Just tell me everything you know that you spent years learning about this subject – in 5 minutes.”

Here’s the solution (instead of here’s the problem).

“We need a whole new system to…”

I need a new computer (aka – my computer is too slow).

“I need to have 15 applications running at the same time to listen to music, watch YouTube, Facebook, e-mail, chat, … Oh, plus I might need to run this application that is my actual job.”

The system is down.

The person next to me is not having any problems, but for some reason my machine doesn’t work, so the entire system is down.

This is not “User Friendly”

“I would like the application to do my job for me so that I can spend more time getting paid to talk with my friends.”

Posted in Uncategorized | Comments Off on The Most Common Problem(s) – Mark’s Rant

A Contact is a Contact (is a Contact…)

Contacts are at the heart of every business application that I have worked with. Whenever I start work on a new application for a client, the first thing I have to plan for is the migration of contact data from some legacy system into the new application. Wouldn’t it be nice if I could just use a service that I could use as a master contact management system? That way I could use the service instead of constantly having to re-invent a custom system or integrate with other systems that have their own contact management system built-in. When I use the word “contact”, I mean a person, company, or organization. When I work with contacts, I want to be able to see all of their basic information, such as names, phone numbers, email addresses, postal and street addresses, etc. I also want to see all the other information in my system associated with a contact, which may include documents, invoices, orders, agreements, appointments, and any correspondences. The correspondences may include email, letters, messages, and recordings. I think this is what anyone would want from a contact management system, but I am finding it hard to find a system like that as a desktop or web-based application.

What I envision is a list of contacts that can be organized into groups, much like the way you can organize them with Gmail Contacts, or Live Mail Contacts. However, these are email clients, so they are primarily designed to work with email (of course). It is interesting that you can’t display a contact and show all of the emails that were sent to and received from that contact. You can sort a list of emails by sent-to or received-from and even filter the results, but you can’t just show the contact with the related emails.

I can find document management systems that come close, but they have to be integrated with email and messaging systems. They also require more integration with customer relationship management systems, human resource management systems, and other enterprise systems. As it turns out, for every major business entity that I work with, contacts are the one entity type that ends up being related to every other major entity. For example, emails have email-to and email-from, orders have orders-to and orders-from, projects have people responsible for each task, etc.
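To make the "contacts relate to everything" idea concrete, here is a minimal Python sketch of a contact index that links any kind of item (email, order, and so on) back to the contacts involved. The class names, field names, and sample data are all hypothetical and only for illustration; a real system would sit on top of a database rather than in-memory dictionaries.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Contact:
    contact_id: str
    name: str
    kind: str  # "person", "company", or "organization"


@dataclass
class ContactIndex:
    """In-memory index linking any kind of item back to the contacts it involves."""
    contacts: dict = field(default_factory=dict)
    # contact_id -> list of (item_type, role, item) tuples
    related: dict = field(default_factory=lambda: defaultdict(list))

    def add_contact(self, contact):
        self.contacts[contact.contact_id] = contact

    def link(self, contact_id, item_type, role, item):
        if contact_id not in self.contacts:
            raise KeyError(f"unknown contact: {contact_id}")
        self.related[contact_id].append((item_type, role, item))

    def everything_for(self, contact_id, item_type=None):
        """Return all linked items for a contact, optionally filtered by type."""
        return [
            entry for entry in self.related[contact_id]
            if item_type is None or entry[0] == item_type
        ]


index = ContactIndex()
index.add_contact(Contact("c1", "Ada Example", "person"))
index.link("c1", "email", "from", {"subject": "Quote request"})
index.link("c1", "email", "to", {"subject": "Re: Quote request"})
index.link("c1", "order", "to", {"order_no": 1001})

# Show the contact together with the related emails only.
emails = index.everything_for("c1", "email")
```

The point of the sketch is that the contact is the hub: every other entity only needs to record which contact it belongs to and in what role.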

Let’s see what can happen when we start with contact management.

Posted in InfoTrail, Uncategorized | Comments Off on A Contact is a Contact (is a Contact…)

Mark’s Manifesto

I have collected some baseline principles that guide my software development process.

The Requirements Principle:
Requirements are never  _____ (fill in the blank) enough.
Most clients are of the “I’ll know what I want when I see it.” variety. By the way, I am that way too. I don’t think it’s possible to fully specify any software project that is worth doing.
“Can you make it do this?” is another common question. My answer: “It’s just a matter of time & money and the mythical man-month” (see the Pi Principle below).
“I want it to be user friendly.” – Of course, I always try to make it as un-user-friendly as possible.

Pi Principle:
Everything takes 3.14 … times as long to complete as what you originally estimate.
Corollary: Everything costs 3.14 … times as much as management wants to spend.
Pi has such a mystical quality about it, the way it goes on forever without repeating. It’s not unlike some projects that seem to never end.

The Golden Rule:
He (or She) who has the gold makes the rules.
But oftentimes the rules are unclear and subject to change. The one who has the gold may not be right, but that person has the power to pucker up and stop a project as well as keep progress flowing. (You don’t have to have brains or a heart to be a boss, you just have to be an ….)

Commitment Principle:
It’s easy to say anything, but hard to do everything.
Corollary: Talk is cheap. Do what you say you will do (DWYSYWD).

Planning Principle:
A bad plan isn’t better than no plan at all, it’s just a bad plan.
Corollary: Success Principle: “If at first you don’t succeed, try, try again. Then quit. There’s no point in being a damn fool about it.” – W.C. Fields

Completion Principle:
The end is not in sight, except when death is looming.
Continuous improvement is good. Deadlines are good too. (Git ‘er done!)

Knowledge Principle:
You only know what you know.
But you can learn anything (and forget a lot of what you’ve learned).
Corollary: Nothing is easy, unless you know it.

The Code of Excellence:
Quality (not necessarily cleanliness) is next to Godliness.
Quality, the level of excellence, is often something that you can’t always define, but you know it when it moves you spiritually.

Posted in Uncategorized | Comments Off on Mark’s Manifesto

Adventures in NoSQL Land

I first got interested in NoSQL databases from a May 2010 article in MSDN magazine. This article started me on a quest to learn more about the potential of using databases other than relational databases like MS SQL Server, Oracle, or MySQL. Although I have worked mostly with MongoDB, I have been amazed at the growth that has occurred in the NoSQL community. Here is just some of the stuff I have discovered:

Wikipedia has an overview of NoSQL. It says that NoSQL could be called “Not only SQL” and that it differs from relational database management systems in that these data storage approaches may not require fixed table schemas, usually avoid join operations, and typically scale horizontally (by adding more servers rather than using more powerful servers). I was particularly drawn to the idea of no fixed table schemas because I have found the total cost of making changes to database structures to be high, particularly for the type of business applications that I develop.
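As a rough illustration of horizontal scaling, here is a minimal Python sketch that spreads keys across a list of servers with a stable hash. The server names and the key format are made up, and simple modulo partitioning is my own choice for the example; it is not how any particular NoSQL product does it.

```python
import hashlib


def shard_for(key, servers):
    """Pick a server for a key using a stable hash (simple modulo partitioning)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


def distribute(keys, servers):
    """Group keys by the server that would store them."""
    placement = {server: [] for server in servers}
    for key in keys:
        placement[shard_for(key, servers)].append(key)
    return placement


keys = [f"contact:{n}" for n in range(1000)]

# Same data, first on two servers, then on four.
two_servers = distribute(keys, ["db1", "db2"])
four_servers = distribute(keys, ["db1", "db2", "db3", "db4"])
```

Adding servers spreads the same keys over more machines. One caveat: with plain modulo partitioning most keys move when the server count changes, which is why real systems tend to use something like consistent hashing instead.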

Roots of NoSQL

I think that NoSQL attempts to address (or accept and deal with) some of the data storage needs for large-scale distributed computing:

  • Distributed Computing – arose from the evolution of the Internet and the ability to create systems “in the cloud”. (I like “The Eight Fallacies of Distributed Computing” attributed to Peter Deutsch, 1994, Sun Microsystems:
    1. The network is reliable
    2. Latency is zero
    3. Bandwidth is infinite
    4. The network is secure
    5. Topology doesn’t change
    6. There is one administrator
    7. Transport cost is zero
    8. The network is homogeneous)
  • CAP theorem for distributed systems – Eric Brewer, 2000 (UC Berkeley)
    – Consistency
    – Availability
    – Partition tolerance
    – (Pick any two)

Challenges

Some major challenges for data storage systems are:
  • Scalability
    • Decentralization – distributed systems
    • Flexibility – elasticity
    • Fault tolerance – how defective machines affect the system
    • Consistency – relative levels and how they affect the system
  • Caching
    • Frequency of reads and writes
    • Eventual consistency
  • Speed
    • Joins and relationships
    • De-normalization
  • Schemas
    • keys
    • indexes
    • views
    • stored procedures
    • etc.
  • Transactions (ACID)
    • Atomicity
    • Consistency
    • Isolation
    • Durability

NoSQL and relational databases address these challenges in different ways, so they have strengths and weaknesses associated with the design decisions.

NoSQL Databases

There are many database products that are called NoSQL. (I was surprised at how many there are and the number is increasing.) Here are some of them, by category:

I wish I could say that I have worked with all of these products, but I can’t. It has been fun fiddling with many of them, though.

NoSQL Means?

With all of these categories and products, what does “NoSQL” really mean? I have a few ideas:

  • No tables – objects, collections, nodes
  • No (or fewer) foreign keys and constraints
  • No ACID – can’t have it all
  • No sophisticated query planners: mostly REST
  • No declarative query language (more procedural)
  • More flexible, fluid designs (dynamic schemas)
  • More natural (and richer?) data representations
  • Highly scalable (horizontal scaling, i.e., more machines, not bigger machines)
  • Sparse data – optional/multi-value fields
  • Large datasets (but small datasets too)
  • Meaningful identifiers
  • Access patterns (such as map-reduce)
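Since map-reduce comes up as an access pattern, here is a minimal in-memory Python sketch of the idea: a map step emits key/value pairs from each document, and a reduce step folds the values for each key. The function names and the sample order documents are hypothetical, and this runs on one machine; real NoSQL products distribute both phases across servers.

```python
from collections import defaultdict
from functools import reduce


def map_phase(documents, mapper):
    """Apply the mapper to each document and group emitted values by key."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped, reducer):
    """Fold each key's list of values into a single result."""
    return {key: reduce(reducer, values) for key, values in grouped.items()}


orders = [
    {"customer": "acme", "total": 120.0},
    {"customer": "globex", "total": 75.5},
    {"customer": "acme", "total": 30.0},
]


def emit_total(doc):
    # Emit one (customer, total) pair per order document.
    yield doc["customer"], doc["total"]


totals = reduce_phase(map_phase(orders, emit_total), lambda a, b: a + b)
# totals == {"acme": 150.0, "globex": 75.5}
```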

Why Use NoSQL?

NoSQL has made inroads into applications when:

  • The cost of scaling up a relational database is too high (when compared with NoSQL).
  • There are lots of temporary data that don’t need to be stored in a relational database.
  • There are complex queries with large datasets that need to be optimized.
  • Transactions don’t need to be very durable.
  • Object models involve too many joins or have to be greatly de-normalized.
  • Large quantities of Large Objects (CLOB or BLOB) are stored in a database.
  • There is a need for fast data reads (but maybe not writes).

Considerations for Using NoSQL

Here are some things to consider, particularly when evaluating using a NoSQL database:

  • What is the problem that needs to be solved? (I know this one seems obvious. 🙂 )
  • Data storage growth requirements – scalability & Big Data
  • Data structure changes – potential shoehorned tables and queries
  • Object inter-dependencies and/or coupling
  • Cardinalities of relationships
  • Data access patterns
  • Application structure
  • Transactions
  • Single collection opportunities
  • Operating system(s)
  • Drivers – availability, support
  • File storage
  • Indexes
  • Map reduce/path traversal
  • Hybrid solution potential

Impact on InfoTrail

I have been experimenting with using MongoDB as the main data storage engine for InfoTrail modules. So far it has shown some significant benefits:

  • A collection-per-entity has reduced the number of tables to deal with. I am looking into potential to use a single collection for all entities that would benefit from caching and another single collection for all entities that are not cached. This could greatly reduce development time and cost as well as fit well into the software factory approach. As a relational database, basic InfoTrail has over 500 tables, and more tables are added for individual customizations.
  • Dynamic schema saves time in modification/enhancement of the modules. I would like to have the ability to have the user/admin add, update, or delete keys, values, and sub-collections from a system admin screen. This also would fit well with the need for change/version control.
  • Data retrieval rates appear to be faster (by at least an order of magnitude), but this will have to be benchmarked.

I haven’t found any reason not to use MongoDB yet. I have tried CouchDB, but so far I have preferred MongoDB. I also am working with Neo4j because it really works well with linking entities together.
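Here is a minimal Python sketch of the single-collection, dynamic-schema idea described above. It uses a plain list of dictionaries as a stand-in for a document collection, so it does not touch MongoDB or its driver at all; the `entity_type` key, the helper functions, and the sample data are my own assumptions for illustration.

```python
collection = []  # stand-in for one document collection


def insert(entity_type, **fields):
    """Store a document; only the entity_type key is required."""
    doc = {"_id": len(collection) + 1, "entity_type": entity_type, **fields}
    collection.append(doc)
    return doc


def find(**criteria):
    """Return documents whose fields match every criterion."""
    return [
        doc for doc in collection
        if all(doc.get(k) == v for k, v in criteria.items())
    ]


# Different entities, and different shapes of the same entity, share one collection.
insert("contact", name="Ada Example", phones=["555-0100", "555-0101"])
insert("contact", name="Example Corp", website="example.com")
insert("order", order_no=1001, contact_id=1)

# Adding a key later needs no schema migration: just update the document.
find(name="Ada Example")[0]["nickname"] = "Ada"

contacts = find(entity_type="contact")
```

Notice that the two contacts have different keys and nothing complains. That flexibility is what saves the modification time, at the cost of pushing validation into the application.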

Links

Here are some links that I found helpful so far:

NoSQLDatabase.org – a great overview site for all things NoSQL
NoSQLTapes – a collection of video discussions and interviews by Tim Anglade

Posted in InfoTrail, NoSQL | Comments Off on Adventures in NoSQL Land

InfoTrail – Backgrounder Part2

(continued from InfoTrail – Backgrounder Part1)

Foundations of InfoTrail

InfoTrail modules are built on the shoulders of the works of many others, so let’s review some of the seminal works that are being used and evaluated:

Where to Start: Top Down, Bottom Up, Middle Out

Start with requirements? These should sound familiar:

  • “I’m a visual person. I’ll know what I want when I see it.”
  • “Give me some alternatives and I’ll pick the one I like best.”
  • “Give me an estimate for the price and delivery for the project I just described to you.”
  • “Above all, we need the application to be user friendly…”

The U2 Syndrome

Business software development is still too much of an iterative process. We talk about reusable components, patterns, and levels of abstraction, but I still haven’t found what I’m looking for. When I want to build an application that has scheduling and calendars, I want to be able to go online and find an open-source (free or very low cost) bundle of code and documentation in the language(s) of my choice, much like I can find a 1/4-20 screw, an 8 foot 2×4 piece of wood, or a lawnmower at Home Depot. So since I can’t find what I’m looking for, I’m going to try to build it.

Let’s Get Started

Consider the common business entities that are listed in InfoTrail – Backgrounder Part1. Couldn’t there be some kind of standardization that would allow the “manufacture” of software that handles those fundamental areas?

After enough (whatever that means) of the specifications for a project were agreed upon, I used to think that it was best to start a project by developing the database first. This was because I wanted to base the application on the universal data models in “The Data Model Resource Book, Volumes 1, 2, and 3”, by Len Silverston. I wanted to build in the standardized but flexible data structures from the beginning. I also found it very helpful to review the current data structures (legacy data) that were being used to assure that all the entities, elements, and data requirements were being met. Also, the data structure, to a great degree, dictated what could be done with the user interfaces. Getting the database structure correct at the outset saved a lot of time because changes to the data structure have ripple effects through the whole project. This can translate to considerable time and money.

However, application stakeholders (my customers) found this approach very difficult to accept for a variety of reasons. Probably the primary reason is that the output, e.g., reports and user interfaces, is what most people deal with when using a system. So it makes sense to start with the user interface and report designs. Stakeholders like to see an application’s output from the outset of a project (if not before).

We want the system to be service-oriented, so we need to start with our services first, right? The middle tiers of the application will have a huge impact on the scalability, reliability, security, etc. of the application.

So what’s the best approach to start with?

Let’s look at where the various elements of the application can be connected into a unified whole (framework) so we can start development at all levels and make iterations continuously as the application is developed. These iterations are necessary until we can establish conventions and standards to follow. We are trying to use a software factory as a business model, so let’s make the design of the factory be the starting point. Also let’s decide to use metadata combined with for the various elements of the designs that our software machines use to create the application. The metadata can serve as a good starting place for the project.

Before we really get started, let’s do some research into NoSQL, RESTful services, jQuery, single page applications, and federated authentication.

Posted in InfoTrail | Comments Off on InfoTrail – Backgrounder Part2

InfoTrail – Backgrounder Part1

What is InfoTrail?

Trails End Systems’ product line, InfoTrail, is based on the concept of a software factory that creates software products and services from application modules that can be assembled, configured, and used for a variety of information technology solutions.

Decisions, decisions

Let’s look at some major elements of a modern application development project (in no particular order):

  • User Interfaces
    • Reports
    • Dashboards
    • Operations
    • Menus & navigation
    • Analysis (Business Intelligence)
    • User experience
    • Multimedia
    • Multiple targets (desktop, browser, phone, devices)
    • Multiple screen sizes
    • Graphic design
  • Data
    • Operational
    • Logs
    • Warehouse
    • Metadata
    • Indexes
    • Views
    • Stored procedures
    • Operational procedures
    • Legacy data
    • External data sources
    • Maintenance
  • Extract, transform, load (legacy data and data warehousing)
    • Tools
    • Validation
    • Logging
  • Services (Middle Tier)
    • Data access
    • Authentication
    • Navigation
    • Business rules
    • Communications
    • Integration
    • Service bus
  • Make vs. buy decisions
    • Product evaluations
    • Cost analysis
  • Development tools
    • Deployment
    • Version control
    • Testing
    • Debugging
  • Development methodologies
    • Waterfall
    • Prototyping
    • Incremental
    • Spiral
    • RAD
    • Extreme
  • Programming languages
    • Assembler
    • C
    • C++
    • C#
    • F#
    • Ruby
    • PHP
    • Java
    • JavaScript
    • XSL
    • HTML
  • Project management
    • Tools
    • Measurements
  • Documentation
    • Help files
    • Code
    • Procedures
    • Promotional materials
  • Users
    • Requirements
    • Training
  • Legal and regulatory requirements
  • Target hardware infrastructure

 … and these aren’t all of the things that require major decisions that impact the success or failure of a software project. Probably the most important is staffing, which isn’t on the list above. Finding the “best” approach is a daunting process. There are many products, services, and techniques available and they are continuing to emerge and evolve with time. Keeping up with all that is readily available is a daunting process too.

(I want to blog about this stuff while my company works through the process of developing InfoTrail modules.)

The software factory approach reduces the complexity and cycle time of software development by assembling applications using standardization (patterns, models, templates, and frameworks), modularization, and code generation. This is just automation techniques applied to software development. It is similar to what has happened to the production of injection molds. The old process involved hand-making drawings of the part to be molded, then hand-making drawings of the mold, and giving the drawings to a mold maker who made the parts of the mold with a milling machine. The more modern approach is to design the part with a computer-aided design program. Prototypes are then made with a 3-D printer. When the design is finalized, the part design file is imported into a mold designing program. The mold design file is then loaded into a computer-controlled electrical discharge machining tool that makes the mold.

Software factories, like manufacturing facilities, can be very specialized, or they can be general-purpose, like a custom manufacturer that simply assembles pre-made components.
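To give a feel for the template-plus-code-generation part of a software factory, here is a minimal Python sketch that renders a simple entity class from a metadata dictionary. The metadata layout, the template, and the `Contact` example are hypothetical; this is not InfoTrail's actual generator, just the smallest thing that shows the idea.

```python
from string import Template

# A tiny source-code template; the $placeholders are filled from metadata.
CLASS_TEMPLATE = Template(
    "class $name:\n"
    "    def __init__(self, $params):\n"
    "$assignments\n"
)


def generate_class(metadata):
    """Render Python source for a simple entity class from a metadata dict."""
    fields = metadata["fields"]
    params = ", ".join(f"{f}=None" for f in fields)
    assignments = "\n".join(f"        self.{f} = {f}" for f in fields)
    return CLASS_TEMPLATE.substitute(
        name=metadata["name"], params=params, assignments=assignments
    )


contact_meta = {"name": "Contact", "fields": ["name", "email", "phone"]}
source = generate_class(contact_meta)

namespace = {}
exec(source, namespace)  # compile the generated source into a usable class
Contact = namespace["Contact"]
person = Contact(name="Ada Example", email="ada@example.com")
```

Changing the entity then means changing the metadata and regenerating, rather than hand-editing code, which is roughly the injection-mold workflow applied to software.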

So what’s a module?

At the lowest level, a software module could be the software equivalent of a nut or bolt in manufacturing. This could be a single file (.dll, text, .exe). But at the highest level it could be the software equivalent of an entire manufacturing facility with all the tools, people, and machines. This could be a group of web sites, web services, applications, and systems. The key is that the modules must be designed to connect (communicate) with other modules so that they can become components of even bigger systems.

The purpose of the InfoTrail product line is to allow the assembly of personal or business information systems from modules that manage data associated with common entities. InfoTrail modules encapsulate a basic business or personal entity including user interfaces, middle tier services and components, and data storage. They are designed to be discoverable, configurable, extensible, and scalable. They also are intended to be self-documenting in a way that is discoverable, configurable, extensible, and scalable because of the software factory methods and components that are used to manufacture the code.

Here is a list of major entities that make up the foundation of InfoTrail (in alphabetical order):

  • Account (Transactions)
  • Agreement
  • Budget
  • Calendar (Events)
  • Campaign
  • Claim
  • Contact
  • Document
  • Facility
  • Location
  • Navigation (Menus)
  • Order
  • Product
  • Quote (Requests for Quote)
  • Report
  • Requirement
  • Rule
  • Shipment
  • Task

User interface shells provide a view into which the various user interface components can be placed. They are designed to be specific to the target device.

Posted in InfoTrail | Comments Off on InfoTrail – Backgrounder Part1

PDC 2010 – my take

I was only able to attend the viewing event at Microsoft’s Alpharetta, GA office on 10/28/2010. I watched the keynote presentations.

Here are the key items I got out of the keynote:

  • Emphasis on IE9 beta with HTML5 and CSS3 in my opinion was interesting, but since it is still in development, it’s still hype. The nice thing is that they are getting behind HTML5 in a big way which will help cross-browser development move forward. I understand the concerns about it detracting from (or eliminating) Silverlight, but I have a wait and see attitude about all that. In the meantime, I am going to go full steam ahead on both.
  • There’s a new OData client library out for Windows Phone 7. New profiling tools are coming soon. I have been trying out OData for an InfoTrail contact manager application for WP7 – so far, so good. I’ll get into more detail on that in another post.
  • There will be an Azure virtual machine role available for Windows Server by year end. That should allow an easier migration path to the Cloud for many legacy applications. It will include Remote Desktop, full IIS, elevated privileges, and multiple administrators. Gee – just like a “real” server!
  • Team Foundation Server is coming to Azure. This is very interesting to me because it opens up more opportunities to collaborate anywhere in the world.
  • Azure Marketplace is being opened up to allow buying and selling of data.

I found a more detailed (and frankly, better) overview here.

The only problem I have with this information is that most of it was things that are coming soon, not things that are available now. But that is normally what is presented in this kind of keynote. I always get a sense of vaporware from this.

Posted in General | Comments Off on PDC 2010 – my take

.NET User Group Notes 10/25/2010

I was impressed with the presentation made by Ritesh Khotari at the Atlanta .NET User Group on 10/25/2010. The topic was WCF 4.0 discovery and routing services. I have been working with WCF services (and RIA services), but I hadn’t learned much of the scope of what WCF really can do. In a way, it was so easy to set up a web service project in Visual Studio to send and receive data that I didn’t take the time to learn what was available for the bigger picture of SOA using WCF. Ritesh’s presentation definitely changed that for me.

WCF Discovery

Ritesh walked us through a simple printer example that demonstrated the use of discovery. He created two simple projects, one (client) that sent a line of text and another (“printer” service) that received the text message and displayed it. He then made the service “discoverable” so the client could see whether it was available. Then he created copies of the service with different endpoints to show multiple “printers” available. By turning the services off or on, the client app could show the user which “printers” were in service or off-line. Each instance of the “printer” service had a different endpoint and broadcast its availability. This reminded me of what MEF and PRISM can do for application modules and components. The loose coupling of services to client applications or other services is pretty easy to do with WCF and really can help with scalability, flexibility, and maintainability. Ritesh added discovery to the client and service projects mostly via additions to the web.config files, but you can also do the same thing in code.
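I don’t have Ritesh’s code, so here is my own minimal Python sketch of the discovery idea from the printer demo. It is not the WCF API and not his example: the registry class, the `IPrinter` contract name, and the printer addresses are all hypothetical. It only models the pattern, in which services announce when they go on-line or off-line and the client asks which ones currently offer a contract.

```python
class DiscoveryRegistry:
    """Toy stand-in for a discovery mechanism: services announce, clients probe."""

    def __init__(self):
        self._online = {}  # endpoint address -> contract name

    def announce_online(self, address, contract):
        self._online[address] = contract

    def announce_offline(self, address):
        self._online.pop(address, None)

    def probe(self, contract):
        """Return the addresses of services currently offering the contract."""
        return sorted(a for a, c in self._online.items() if c == contract)


class PrinterService:
    def __init__(self, address, registry):
        self.address = address
        self.registry = registry
        self.printed = []

    def start(self):
        self.registry.announce_online(self.address, "IPrinter")

    def stop(self):
        self.registry.announce_offline(self.address)

    def print_line(self, text):
        self.printed.append(text)


registry = DiscoveryRegistry()
printers = {addr: PrinterService(addr, registry)
            for addr in ("printer-a", "printer-b", "printer-c")}
for printer in printers.values():
    printer.start()

printers["printer-b"].stop()            # take one printer off-line
available = registry.probe("IPrinter")  # the client never hard-codes addresses
printers[available[0]].print_line("Hello from the client")
```

The loose coupling shows up in the last two lines: the client only knows the contract name, and the list of usable endpoints is whatever happens to be on-line at that moment.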

WCF Routing

Ritesh then went on to discuss WCF routing. When the number of services in a system gets large, it becomes difficult to maintain because the services can be moved from device to device, added, deleted, or addresses can change. A good approach to mitigate this problem is to use a service router that serves as a centralized place to control traffic to available services. This is one of many approaches, but Ritesh made a compelling argument for using WCF routing by showing another simple example. He created a routing service project and showed how to make changes to service endpoints by making changes to the web.config. Then he showed how to add rules to the routing service, again in the web.config, that would allow routing according to specified conditions being met. His example showed how incoming messages could be routed to different services depending on who was sending them or where the message was sent from.
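Again without Ritesh’s code, here is my own minimal Python sketch of rule-based routing. It is not the WCF routing service or its configuration syntax; the `Router` class, the message keys (`sender`, `origin`), and the service names are hypothetical. It just shows a central router that checks rules in order and sends each message to the first endpoint whose condition is met.

```python
class Router:
    """Toy content-based router: the first matching rule picks the endpoint."""

    def __init__(self, default_endpoint):
        self.rules = []  # list of (predicate, endpoint name) in priority order
        self.default_endpoint = default_endpoint

    def add_rule(self, predicate, endpoint):
        self.rules.append((predicate, endpoint))

    def route(self, message):
        for predicate, endpoint in self.rules:
            if predicate(message):
                return endpoint
        return self.default_endpoint


router = Router(default_endpoint="general-service")
# Route by who is sending the message...
router.add_rule(lambda m: m.get("sender") == "accounting", "billing-service")
# ...or by where the message was sent from.
router.add_rule(lambda m: m.get("origin", "").endswith(".eu"), "eu-service")

destinations = [
    router.route({"sender": "accounting", "origin": "hq.example.com"}),
    router.route({"sender": "sales", "origin": "paris.example.eu"}),
    router.route({"sender": "sales", "origin": "hq.example.com"}),
]
# destinations == ["billing-service", "eu-service", "general-service"]
```

Because clients only talk to the router, a service can move or be replaced by changing one rule in one place, which is the maintenance benefit the talk emphasized.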

Here are some of the key points I got out of the presentation:

  • WCF has facilities beyond just sending and receiving messages between applications and/or services
  • WCF discovery allows loose coupling of applications with services that can allow better SOA architectures to be implemented.
  • WCF routing is a great way to create SOA implementations that scale and are easier to maintain as the number of applications, services, and devices increases.

Use WCF discovery and routing to make your SOA system better in many ways.

When I get a link to download Ritesh’s code, I’ll add it to this post.

Thanks, Ritesh!

Posted in WCF

Windows Phone 7 Sessions

During the past few months I attended some live seminars hosted by Microsoft. Each of them has covered the basics of developing software for Windows Phone 7. The most recent was the two-day event held here in Atlanta (Oct. 21, 2010 & Oct. 22, 2010). It was an expanded version of the one-day Firestarter event that I attended last month. The key presenters were Joe Healy and Glen Gordon.

Here are some of the key things about WP7 that I got out of the sessions:

  • I am looking forward to November 8th when I can buy a real phone to work with. It has been a long wait since I attended my first seminar on WP7 and I would like to break out from only using the emulator.
  • WP7 may be promoted as primarily a consumer-oriented product, but it can be used as a general-purpose user interface for business applications. The only problem is there is no easy method for deployment to business users except for the app store, which opens up issues with security. The hope is that there will soon be something like private app stores that could be used by businesses. In the meantime, there are phone versions of Outlook and Office included.
  • Application development for WP7 is much the same as with any other Silverlight project. However, data persistence must be handled to avoid data loss when the user exits from the app, or when the app is shut down by the operating system. Fortunately, there are events that can be used to trigger saving of user data.
  • You can also develop games using XNA. I haven’t tried this yet, but it looks like fun. My experience with making Flash games showed me that creating all the graphics, sprites, and sound effects takes the most time, but once that work is done you can do a lot of great things. You can even build 3D applications for WP7 and the rendering speed and flexibility is impressive.

So I was inspired to try my hand at writing an app for WP7 that was more than just a “Hello World”. The application I am making is a contact management system that uses the same database as my Silverlight and LightSwitch applications. More on that in upcoming posts.

Since I had already installed the beta version of the WP7 tools, I had to uninstall them before installing the latest version. That took a long time. I think the whole process lasted a couple of hours. I was able to take a nice nap during the final install.

I got the WP7 tools here…

There is also a Windows Phone Training Kit for Developers here…

That’s all for now. Back to programming!

Posted in Windows Phone | Comments Off on Windows Phone 7 Sessions