I recently assisted in a review of data modeling tools for a customer.  I won't mention which one we chose, but I would like to give a general review of the tools.  The 3 major tools on the market for data modeling are CA Allfusion Data Modeler, Sybase PowerDesigner, and Embarcadero ER/Studio, and we looked at these 3 competitively.  Each has their own pros and cons. Some provide better SQL Server 2005 support than others.  Some are just plain easier to use than others.  I won't go into a diatribe about data modeling, but if you are looking for a new tool, or if you are looking to begin data modeling for your development projects, you might want to read on to see which tool might be a good fit for you.

CA AllFusion Data Modeler R7

This is the tool formerly known as ERWIN. If you've been in data management for very long, you've probably seen this tool at least once before, most likely as shelf-ware along with many other data management products.  That's not to say ERWIN isn't a fine tool, just that it is underutilized.  ERWIN has gone through a beleaguered history bouncing from LogicWorks to Platinum and then being consumed by Computer Associates.  It was really the first of the PC-based graphical data modeling CASE tools.  ERWIN, or I should say AllFusion Data Modeler as CA now calls it, is in it's 5th major version, but it's being called R7 (mostly I think to keep up with the Joneses - see versions of the other products being reviewed).  The latest version provides additional support for DBMS platforms other than SQL Server and it has little (if any) support for SQL Server 2005 new features. ERWIN's strengths continue to lie in green-field development.  It is very fast at creating a new logical data model, converting that to a physical data model, and generating the schema to create the database objects.  ERWIN's weaknesses continue to lie in metadata management, reporting, graphical layout of the model, and creating ALTER TABLE schema when comparing to an existing database.  The latter is the reason it is difficult to use for anything other than green-field development.  Both of the other tools provide much better means for developing enterprise databases for the long-haul.

Pros

Easy to create new structures
Long History
Full-Featured logical modeler
New Schema DDL Generation

Cons

Metadata Management
Poor Support from helpdesk or website
Graphical Layout - one method, not easy to read
CASE tool to generate ALTER DDL
Lack of SQL 2005 support
Price - it was the most expensive of the 3 solutions

Sybase PowerDesigner 12

Sybase has put a lot of effort into designing a comprehensive modeling suite. Their enterprise edition modeling tools allow for collaborative modeling through all phases of the SDLC.  The ideal scenario for PowerDesigner is to start by capturing Use Cases and modeling the results in PD.  The Use Case Model then pushes down to a number of sub-models, including a Conceptual Data Model.  Data Items can be defined by the business analyst in the Use Case, and the data modeler then adds the Data Item to the appropriate entity.  This is an incredibly empowering scenario for the business, as IT is no longer the only one with data definition capability and responsibility.  It also takes a lot of work off the data modelers' plate as now they have less work to do defining the nitty-gritty of each attribute. 

Once a Conceptual Model is complete, it can be pushed down to a Logical Model for further definition.  A complete Logical Model can be pushed further down to a physical model based on a specific DBMS.  All of these models are decoupled, in that changes to one do not affect the other(s).  PD does provide a link and synch tool to push changes up and down the model chain, allowing you to model at any level, while still keeping all the levels in synch. 

PD is an excellent end-to-end, formal modeling solution providing a collaboration repository.  Its strengths are many. The rich feature set allows the modeler to do just about anything you'd ever want to do.  Besides data modeling, the CASE tools provide Object Modeling and code generation in virtually all the major languages, making it a great place to start any project.  The downsides are just as you might expect.  The capabilities of this tool bring much complexity.  Ripping out a simple data model for a new project is not all that easy.  The tool really pushes you down a process based development pattern.  Also, the interface is not all that intuitive. I have been data modeling for many years, and it took me a  significant amount of time to be able to create entities and relationships for 20 entities.  Lastly it struggled much like ERWIN in creating ALTER DDL.  It was quick good at creating new DDL, but struggled when comparing a model to an existing database.

Pros

Complete business proccess and IT modeling tool
Rich Collaboration
Exceptional Metadata Management (Blobs and Definitions and search capability)
Flexibility in configuration
OK SQL 2005 support

Cons
Complexity
User Interface
Comparison and Script Generation

Embarcadero ER/Studio 7.1
I believe ER/Studio is the youngest of these modeling tools, but it certainly can hold its own with the older brothers.  ER/Studio provides rich data modeling capabilities, metadata management, reporting, and automation.  The last point is a feature I really like, the macros.  Users can write their own productivity and standard enforcement scripts to change models in whatever way necessary.  Embarcadero opens up the object model for access to program your own, but they also provide many helpful macros in the box.

Like PD, Embarcadero decouples the physical model from the logical model, providing compare and synch capabilities between models and the target database.  This last point makes ER/Studio stand-out.  It does a great job of creating alter DDL when comparing to an existing database.  Many tools will rename the table, recreate the table, and re-insert the data for even the simplest changes.  ER/Studio does a pretty good job of knowing when you just need an ALTER TABLE ADD COLUMN script.

The user interface and model layout features of ER/Studio are beyond belief.  There are six ways to layout a model. All of them are meanigful and fast, even with large models.  There are some areas of the user interface that take longer to do things than say ERWIN or PD, but in general it is intuitive and easy to use.

Lastly, ER/Studio now comes in a just for SQL Server edition, which is much less expensive than the competitors, which charge for all the databases, even if you never plan to use them.

Pros

Great Model Layout
Automation Macros
Good Comparison tool and DDL Generation
OK Metadata management
Great Helpdesk and feature enhancement support
SQL 2005 support
Price (SQL Server Edition)

Cons

Some tedious areas of the interface
A little buggy (needed to call support a few times)

As you can see, all of these tools can provide an organization with good data management abilities.  One thing I didn't mention in the reviews is that data management is more than implementing a tool.  It takes discipline and process enforcement.  None of these tools will do that for you. That's the people part of this industry.  Hopefully, this review will help others as they look to choose a tool to implement in their SDLC.

Jon

with 0 Comments

There is a new article by Chris Shaw on SQLServerCentral.com about "Hiring a DBA".  I want to start off by saying that Mr. Shaw is right on, and thanks to him for writing this!  Having just gone through a hiring process, which was like many I've had in the past, I feel his pain. 

Hiring an employee or a consultant is always an adventure.  As Chris put it, it's more art than science.  Sometimes you just know one way or the other.  Sometimes, you are just wrong!  I recall one instance where the candidate interviewed with me and another Sr. DBA.  The guy answered many technical questions correctly, said "I don't Know" in the right places, and seemed to communicate very well.  However, when he arrived on site, he sadly did not work out due to lack of initiative and creativity and lack of ability to work in a highly available environment.  We were just wrong!

Hiring is typically a time-based task.  Very rarely can you wait around for the absolutely perfect candidate to show up on your doorstep.  I liken it to buying a house.  You can't buy the best house in the world (well, most of us can't anyway).  You buy the best house on the market when you are looking, where you are looking, and in the price range you can afford.  You recognize it might need some fixing up or doesn't have all the ameneties you were looking for, but that's OK.  You might add/fix it later or live without it.  As long as the candidate has the basics and the personality matches, that's often enough.

In Chris' article he breaks down the process into a few stages, including position definition, position posting, resume review, and interviewing.  I really liked the first two sections, and I have things I'd like to add to the last two.

Resume review is something that comes with a good bit of experience.  After talking with a number of candidates after reading their resume, I've learned to spot quite a few things, now.  Some of my favorites are

  • Search and replace of Oracle and/or DB2 with SQL Server.  This one kills me.  It's obvious on the resume when a candidate doesn't really know SQL Server.  This last round we had a candidate that had the title of SQL Server DBA, but all the tools listed were Oracle (TKPROF, OEM, etc.).  I realize that Oracle has a larger market share than Microsoft, so it's likely the person did both, but the resume didn't read that way at all.
  • Developers trying to become DBAs.  Yes, there are 2 types of DBA; Production Support and Application Development.  I am all for people expanding skillsets and working into new positions.  I just wish they were upfront about it.  I see alot of applications developers applying for Production Support roles, and the resume reads "developed SQL and stored procedures for .NET application", or "designed tables to support data storage."  When this type of candidate is asked about fragmentation or database consistency or backups, s/he really doesn't know.  It's just too far of a stretch.  That's an easy one to see on a resume, though.
  • Lots of job/contract hopping.  As Chris points out long tenure is easy to spot on a resume.  So is job hopping.  While I understand that "long tenure" has a different definition today than it did 20 years ago, it does seem reasonable that a person would catch on with a company for a couple of years at least.  There are exceptions to this, such as project related consultants.  Typically, that type of person is easy to spot, too, due to the high level of project tasks and variety of technologies used.  When I spot these types of people looking for long-term gigs, I like to phone interview and find out why.  Sometimes the reasons are legit and the person will work out.  Other times, you may be just the next hop.

Interviewing is also something that comes with a good bit of experience.  Many people think that answering technical questions and being funny or nice is all you need to get a job.  Actually, I've recommended people for hire without asking a single technical question, and been right!  More than knowing book answers or being able to quote a popular website or magazine, I like to learn about a candidate.

I prefer a two stage interview. First the phone interview, then the face-to-face.  The phone interview is crucial.  Unless a candidate comes recommended by someone I respect, I do a phone interview, and sometimes even then.  I like to ask a person to tell me a bit about him or herself and what recent projects s/he has done.  I can quickly tell about communication ability and enthusiasm or passion from these questions.  I might lob a few softball technical questions at the person on the phone, maybe more than softballs if s/he is long distance and would need to travel for a face-to-face.  BTW, pay attention to the sounds on the other end of the phone.  If you hear mouse clicking or keys tapping, the person may be googling or looking in BOL for answers (ask me how I know!).

Next is the face-to-face, and I hope you are wearing anti-perspirant for it.  It could get messy.  I prefer a group interview with some senior people, possibly even developers.  I try to push the person until I learn what I need to know to recommend or hire the person.  I'm really looking for the following:

  • How does a person handle stress?  I'll use questions or scenarios to get the person to a point of stress to see how they handle it.  My friend Scott Sawatzki likes to say people handle stress in 3 ways; Fight, Flight, or Freeze.  I avoid DBAs that freeze.  There is a lot of stress in the DBA role, typically.  If you've ever had a cubical full of managers waiting for you to restore a database or restart a service, you know what I mean.  How a person will behave in this type of situation is crucial to his/her success.
  • How does a person handle lack of knowledge.  Like Mr. Shaw, I like to hear people say "I don't Know", or "I would look that up".  Think of an interview as an adaptive test.  It will get harder as the candidate answers hard questions, and it won't stop until "I don't know" is reached. 
  • How does a person solve problems?  Does s/he work them out silently or verbally, or not at all?  If given a scenario question where a long answer with multiple steps are required, what does the person do?  There is so much problem solving for DBAs, it just requires a good problem solver.  Whether it is troubleshooting or SQL development, a candidate must be able to solve a problem on his/her own.  I tend to get a hint of organizational skills here, too.
  • How much information does a person offer when answering questions?  Does the person give short, terse answers, or does s/he prattle on and on getting off topic completely.  Like Chris Shaw mentions, this is typically a sign the candidate doesn't know and is trying to divert.  I like answers to be somewhere in the middle of these 2 extremes.  I recall a hire where I did not recommend the candidate, but he was brought in anyway.  I didn't like the fact that he talked too much in the interview.  In the end it was discovered the guy really didn't have the basic understanding needed and spent most of his day socializing.  He was let go in under 6 months.

There are many discussions about the types of questions to ask in an interview.  Given there are a number of sites that already list questions, I won't add to that.  I think this is a good starting point. http://vyaskn.tripod.com/iq.htm.  Vyas has provided questions by category, which is helpful when you might be interviewing for different roles.  Obviously, I have my own set of questions, but a candidate should at least be able to answer a set like these that are readily available. Essentially, I like to ask questions that are representative of what the person will be doing.  I like Chris Shaw's idea of making a list of tasks and projects.  That will narrow down the types of questions to ask.  I also like scenario questions where the person must give more than one word answers to demonstrate technical knowledge.

Hiring is hard.  It bogs down a staff trying to do it, and unsettles the team for a month or so afterward.  That's why its important to do it as infrequently as possible by getting the right candidate the first time.

What are your hiring or interviewing experiences?  What do you think of my additions to Chris' aritcle?

Jon

 

with 1 Comments

Jeffrey Yao has written an article on SQL Server Central about how to be a good DBA.  Essentially, it is a roadmap for a developer or junior DBA to become the Sr. DBA.  While this is very high-level, it has some good points.  Since I'm often asked this question, I would feel fairly comfortable referring people to this article as a starting point. 

I do think Jeffrey missed some things, or at least didn't draw them out as plainly as I would have, so I thought, hey, why not add another entry to the blogsphere.

First, I think it is imperative to know set theory and relational theory.  Reading Dr. Ted Codd, Chris Date, Joe Celko, and Itzik Ben-Gan would be a good idea.  Often developers don't understand set theory and attempt to write data access logic using procedural constructs (loops, cursors, row-by-row processing). Yao hints at knowing T-SQL and data modeling well, which is good, but I think it is deeper than that.  T-SQL is an implementation (and not a very bad one) of SQL, which is an implementation of set theory.  Knowing the foundation is more important than knowing the implementation my opinion.

Second, I think Jeffrey glosses overy the importance of Information Delivery or Business Intelligence as a Sr. DBA.  This is all buried in the last point under Sr. DBA, but moving from relational structures to multi-dimensional structures that mean something to business users is a completely different way of thinking.  Being able to understand the business well enough to gather the requirements and communicate a proposed solution is a soft skill that Jeffrey mentions, but the technical side of that seems to be lost somewhere in the article in my opinion. 

I also think that Jeffrey missed a direction from which people migrate to DBA work.  I myself started as a programmer, moved to infrastructure, then settled in what I call the middle as a DBA.  I see people coming from both directions into the DBA role, and sometimes they lack the understanding from the other direction.  A network engineer coming to the DBA should learn some things about how to develop middle-tier and front-end code, just as a developer should learn the OS, network, and other infrastructure pieces.

On top of just learning how to program in a particular language or set of languages, I think an intermediate to Sr. DBA should work to understand the applications that use the data.  Not just the business, not just the language, but the architecture and implementation.  These will inherantly have quirks due to trade-offs or bad judgement.  The DBA should know these and learn how to work around them while not sacrificing the quality of the data.  The Sr. DBA should be part of the architecture team of an organization, helping to guide and direct the IT partnership with the business, including but not limited to how to fit new requests into the current architecture.

Lastly, I think it is important to stay current and up-to-date on skills and technologies.  This is hard to do, but it is very necessary.

Well, that's my $0.02.  Well done Jeffrey!  Thanks for writing this.

Jon

with 2 Comments

Awhile back I started at a new customer doing some project work and some production support work (sigh). One thing the customer requested was for a DBA to run a script on all production servers and review the output to make sure nothing was wrong, then email a summary to the other DBAs so they also knew nothing was wrong.  The interesting thing was that there was no clear cut set of criteria of what "wrong" was.  This all sounded rather automatable to me, so I set out to save myself and fellow DBAs some time and learn some things I had been meaning to learn, namely Reporting Services. 

In addition to saving time every day,  I also wanted to be able to track some stats that were collected over time, especially things like disk space remaining, DB size, etc.  The current monitoring solution employed by the customer didn't have this feature, so it would be a helpful addition.  I also wanted to nail down what the DBAs were looking for, so there was consistency in the monitoring, and I wanted to make it a push subscription instead of having to physically do something each day. Here's was I did.

First I created a new DBA database on each server to be monitored.  Then I turned pieces of the daily monitoring script the DBAs were using into stored procedures, but I altered them only to collect data that I didn't already have or didn't care about - no dups, no extra error messages, nothing I didn't want to see. BTW, this gave me a lot of cool scripts to use in other engagements, too. Next I scheduled these stored procs to run each day, collecting data to tables in my DBA database.

Next I created a repository for centralized reporting - a monitoring ODS of sorts.  The tables in the repository were identical in structure to the local server repositories.  I then created a DTS package to copy the data from the source server to the repository.  This would run each night sometime after the stored procs were executed on each server.

Finally, I created some reports.  The first reports were detail reports for things like errorlogs, backups, jobs, etc. The really important report was the summary report(see below), which the DBA team subscribes to as a daily email.  It basically provides a dash board for each production server. You can see at a glance where the problems are.  Oh, yeah, I also worked with the DBAs to determine what "wrong" meant, and what were acceptable thresholds for the things to be monitored. Each of the items drills down into the detail reports, so you can see specific data on each server.  So if you see something is "Not OK", you can find out why. (Note there's a lot of conditional formatting that you can't see unfortunately due to some limitations either of the posting tool, or my HTML abilities)

Daily Monitoring Summary
Server Name Backup Status Error Log Status Job Status Disk Status  
Server1 OK OK OK OK  
Server2 OK OK OK OK  
Server3 OK OK OK OK  
Server4 OK NOT OK OK OK  
Server5 OK OK OK OK  
Server6 OK OK OK OK  
Server7 OK OK OK OK  
Server8 OK NOT OK OK OK  
Server9 OK OK OK OK  
Server10 OK NOT OK OK OK  
Server11 OK NOT OK OK OK  
Server12 OK OK OK OK  

I thought this was pretty cool.  It has saved quite a bit of time.  We estimate it saves about 2-3 hours a week.  That's a good bit of savings.

I have lots of ideas of how to improve it including more parameterization and categories to check.  I've also thought about creating an online service where maybe people submit an XML document over SOAP with the details from each server to monitor, and then they can use the reports for their own environment.  Just a thought.

I'm sure I'm not the only one to do something like this. What have you done?

Jon

 

with 0 Comments

Among the people I work with, I'm known as a command line guy in a Windows world.  Yes, I use Start | Run to open cmd about 100 times a day, and I usually have a few open on my desktop.  But I never really took to ISQL or OSQL when working with SQL Server.  Oh, don't get me wrong, I use Query Analyzer 10:1 over SEM (some things are just faster in the GUI, as much as I hate to admit it). 

One tool in SQL Server 2005 may change me from the GUI completely (notice I said may), and that's SQLCMD.  I became familiar with the tool while working on the presentation material for the Detroit Launch Event.  Until then, it was but a bullet point on some marketing slides. 

In true Letterman fashion , here are my Top 10 Reasons why SQLCMD could become my new favorite tool. 

1. Extensive variablization.  Environment variables and declared variables can be used in various places in the script, including to change database (E.G. USE ($DB) ) and changing the server you are connected to (more on this next).

2. :CONNECT.  From within a single script, you can connect to multiple SQL Server instances and execute commands.  This allows for a great deal of automation possibilities, such as backup from one server, restore to another server.  Also, you could execute a configuration change script on all servers in production.  That's extremely powerful.

3. :!!.  This command allows you to execute an OS command on the client without having to connect and use xp_cmdshell.

4. SQL.INI.  This ini file allows you to set default parameters and even create a default query to execute on the server as soon as you connect.  This helps you to automate your connection properties and to some degree your environment in command line mode

5. SQLCMD Mode in SSMS.  Yeah, that's write, I'm mentioning the GUI in this post.  Essentially, if you build your SQLCMD script is SSMS, you have an IDE for development with all the benefits of command line.  It's the best of both worlds! You can set this in Tools | Options or click the button with the Red ! in SSMS (Not the Execute button, the other one).

6. Error Handling. :Error :On Error :EXIT with the -b and -v switches.  There's pretty much all you need in some combination of these to gracefully deal with run-time problems.  That is so much better than piping out and digging through log files! 

7. :PerfTrace.  This feature lets you put your statistics (IO, TIME, ShowPlan) out to a separate file for later viewing as so not to clutter the results file output.

8. :XML ON - Put all out put to XML format

9. Performance. SQLCMD connects with the .NET SQLCLIENT OLEDB Provider instead of ODBC, which is a much better performer

10. Remote Dedicated Admin Connection (DAC).  If you use the -A switch, you can get into a server that might otherwise be hung.  You can kill an offending process and restore order to the universe - without a reboot!

You'll notice that there are a lot of new commands, almost a sub-language for SQLCMD.  It might take a little while to learn these new features, but I think it will be well worth it.

So now Oracle DBAs can't kick SQL*Plus in our faces anymore.  We have a tool that easily competes and may be even better.  Have you looked at SQLCMD yet?  Are there other features of this tool you think are important? Tell me what you think.

Jon Baker

with 1 Comments