First of all, don't let this unfamiliar word scare you. If you are a mathematician and familiar with "category theory" you may have encountered this word and concept before, but if so, you are one out of a million people. I mention this, not because I have any experience with "category theory", but just so you know where this odd word "monad" comes from.
Many people understand and use monads, and so can you.
What is most important to learn about monads is that they are all about a certain sort of function composition (i.e. chaining of operations). Some special operators have been developed (three of them by my count) that provide the glue that makes all this work, but we are starting to get ahead of ourselves.
In other words monads are about plumbing. One person called a monad a "type disciplined pipeline". So the monad is about the ability to chain functions together, not about what is being chained together.
Forget about IO. Monads are not about IO. Indeed IO is a monad, but the whole pure/impure world issue is independent of (orthogonal to) the Monad business, and tackling both of these at the same time will just confuse you. Learn monads, then go back to IO later.
It seems to be true that monads deal with "fancy values". So we have a value with some added side property, and this business of monad chaining disciplines the way this side property gets propagated.
Most importantly, "Monad" is a type class. If this is new to you, you should go study that and get a clear understanding before trying to digest anything further. Many Haskell data types belong to the Monad type class, and it may not be necessary or helpful in most cases to worry about them being Monads. Lists are an example.
For some time I thought monads were all about IO, but this is most definitely not the case. Monads are indeed used to handle IO in Haskell, but you can (and should) get a handle on monads without even thinking about Haskell IO.
Haskell "do notation" is all about making this monad chaining business prettier. Since we are talking about a sort of pipeline, the fact that "do notation" yields a sequencing of operations should not be a surprise.
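Here is a small sketch of what that prettiness buys you. The function names addStrings and addStrings' are made up for this illustration. readMaybe is the standard function from Text.Read. The first version uses "do notation" and the second spells out the same chain by hand with the bind operator (>>=).

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: parse two strings as Ints and add them.
-- Written with "do notation".
addStrings :: String -> String -> Maybe Int
addStrings a b = do
  x <- readMaybe a
  y <- readMaybe b
  return (x + y)

-- The same thing with the chaining written out explicitly.
addStrings' :: String -> String -> Maybe Int
addStrings' a b =
  readMaybe a >>= \x ->
  readMaybe b >>= \y ->
  return (x + y)

main :: IO ()
main = do
  print (addStrings  "2" "3")     -- Just 5
  print (addStrings' "2" "3")     -- Just 5
  print (addStrings  "2" "oops")  -- Nothing
```

Both versions behave the same way. The "do" form just reads more like a sequence of steps.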
A type class specifies certain behaviors and operators that a data type must support. There is an Eq type class, and for the == and /= operators to make sense, a data type must belong to this type class. (Note that Haskell spells "not equal" as /= rather than !=.)
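To make the type class idea concrete, here is a minimal sketch of putting a data type into the Eq class by hand. The Color type is invented for this example. In real code you would usually just write "deriving Eq" and let the compiler do it.

```haskell
-- Hypothetical data type, made up for this illustration.
data Color = Red | Green | Blue

-- Joining the Eq type class means supplying the == operator.
instance Eq Color where
  Red   == Red   = True
  Green == Green = True
  Blue  == Blue  = True
  _     == _     = False

main :: IO ()
main = do
  print (Red == Red)   -- True
  print (Red == Blue)  -- False
  print (Red /= Blue)  -- True; /= gets a default definition from ==
```

The point is that once Color supplies what Eq asks for, every function that works on "any Eq type" works on Color as well. Monad is the same sort of arrangement, just with different operators.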
In general, Monads "wrap" or augment a basic data type with a certain "smell" or character. A list Monad for example can have a plurality of values. The "Maybe" Monad augments a basic data type with the possibility of failure. The key aspect in dealing with Monads is to maintain and propagate that augmented character in a way that is elegant and convenient for the programmer.
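A short sketch may help show this "augmented character" being carried along. The helpers safeDiv and neighbors are made up for this illustration. The chaining is done with the bind operator (>>=), which is one of the standard Monad operators.

```haskell
-- Hypothetical helper: division that can fail.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Hypothetical helper: one value in, several values out.
neighbors :: Int -> [Int]
neighbors n = [n - 1, n + 1]

main :: IO ()
main = do
  -- Maybe: the possibility of failure is propagated through the chain.
  print (safeDiv 100 5 >>= (`safeDiv` 2))  -- Just 10
  print (safeDiv 100 0 >>= (`safeDiv` 2))  -- Nothing
  -- List: the plurality of values is propagated through the chain.
  print ([10, 20] >>= neighbors)           -- [9,11,19,21]
```

Notice that the code doing the chaining never checks for Nothing and never loops over the list. The monad takes care of that plumbing.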
To understand any type class, the heart of the issue is to understand the operators that the class supports. Here they are for Monads. There are just three that you need to worry about.
Tom's Computer Info / tom@mmto.org