Apart from the well-known challenges of scaling the backend of a chat system or service, there are a few problems that need to be tackled to give users a smooth experience.
While implementing a scalable chat system, my team ran into a few unusual situations that we didn't anticipate. Some of them are listed below:
One of the biggest challenges is detecting a closed socket. Sometimes when an app exits (read: is killed), it fails to notify the chat server that the client has died (browsers tend to be better behaved here). As a result, the server still thinks the socket is alive, and you may find the user associated with that socket still showing as online. One way to handle this is to run a cron job that identifies which users have been inactive for some time, sends each of them a ping from the server, and removes them from the online list if the send fails. The other way, which probably requires the server to handle more requests, is to have each client send a ping at a fixed interval, say every 30 seconds, and mark a user as offline if the server hasn't received a ping from them in more than 40 seconds. There can be multiple solutions, but what's important is to know that sockets may not behave as expected when it comes to detecting disconnection.
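The second approach (clients ping every 30 seconds, the server expires anyone silent for more than 40) can be sketched roughly as below. This is only a minimal in-memory illustration, not what our system actually ran: the `PresenceTracker` class, its method names, and the injectable `clock` parameter are all hypothetical, and a real deployment would likely keep this state in a shared store.

```python
import time
from typing import Callable, Dict, List


class PresenceTracker:
    """Hypothetical helper: remembers the last ping per user and expires silent users."""

    def __init__(self, timeout_seconds: float = 40.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_seconds = timeout_seconds
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_seen: Dict[str, float] = {}

    def record_ping(self, user_id: str) -> None:
        """Call whenever a ping (or any other message) arrives from the user."""
        self._last_seen[user_id] = self._clock()

    def is_online(self, user_id: str) -> bool:
        last = self._last_seen.get(user_id)
        return last is not None and self._clock() - last <= self.timeout_seconds

    def sweep(self) -> List[str]:
        """Remove users whose last ping is older than the timeout; return their ids."""
        now = self._clock()
        expired = [user for user, seen in self._last_seen.items()
                   if now - seen > self.timeout_seconds]
        for user in expired:
            del self._last_seen[user]
        return expired
```

The server would call `record_ping` from its socket handler and run `sweep` on a timer, broadcasting an "offline" event for every id it returns.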
The nightmare for any chat system is a slow connection that frequently drops packets. For starters, even if you begin with a wonderful package like socket.io (probably the simplest to start with), keep in mind that in the future you may have to switch to raw WebSockets, your own protocol, or perhaps even WebRTC for a low-latency solution. You may decide against XMPP (which is pretty much the standard) at first, given limited resources and its learning curve; but as the project matures, you can bring in other protocols as you learn more about the whole ecosystem and how to abstract and optimize.
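One way to keep such a switch cheap is to hide the transport behind a small interface from day one, so the chat logic never touches socket.io or WebSocket calls directly. The sketch below is only an illustration of that idea: `Transport`, `InMemoryTransport`, and `ChatClient` are hypothetical names, and the loopback transport stands in for a real network adapter.

```python
from typing import Callable, List, Protocol


class Transport(Protocol):
    """Hypothetical minimal contract every transport adapter would satisfy."""

    def send(self, payload: str) -> None: ...

    def on_message(self, handler: Callable[[str], None]) -> None: ...


class InMemoryTransport:
    """Loopback stand-in: delivers each sent payload to all registered handlers."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[str], None]] = []

    def send(self, payload: str) -> None:
        for handler in self._handlers:
            handler(payload)

    def on_message(self, handler: Callable[[str], None]) -> None:
        self._handlers.append(handler)


class ChatClient:
    """Chat logic depends only on the Transport contract, not on a concrete library."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self.inbox: List[str] = []
        transport.on_message(self.inbox.append)

    def say(self, text: str) -> None:
        self._transport.send(text)
```

Moving to another protocol later would then mean writing one new adapter with the same two methods, while `ChatClient` stays untouched.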
One difficult challenge is making sure a user reliably receives unread messages over an "on and off" connection that keeps dropping. If a user goes offline and comes back later, they have to be given the messages that were sent while they were offline. Usually that is done in the response to an API call, or through the socket when the user reconnects. The issue is that the user may receive the same message multiple times: once through the live socket and then again from the API call after reconnecting to the server, since reconnection may happen frequently on an "on and off" connection. The real challenge is on the client side, because a message coming through the live socket usually arrives on a different thread than the one handling the API response. The user ends up with duplicate messages, and checking for duplicates across multiple threads can be tricky if it is not handled well.
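A common way to tame this on the client is to funnel both sources through a single store that is keyed by message id and guarded by a lock. The sketch below assumes every message carries a unique `id` and a sortable `sent_at` field assigned by the server; those field names and the `MessageStore` class are assumptions made for illustration.

```python
import threading
from typing import Dict, List


class MessageStore:
    """Hypothetical client-side store that drops duplicates across threads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._by_id: Dict[str, dict] = {}

    def add(self, message: dict) -> bool:
        """Insert the message; return False if this id was already stored."""
        with self._lock:  # check and insert happen atomically
            if message["id"] in self._by_id:
                return False
            self._by_id[message["id"]] = message
            return True

    def ordered(self) -> List[dict]:
        """Return a snapshot of stored messages sorted by send time."""
        with self._lock:
            return sorted(self._by_id.values(), key=lambda m: m["sent_at"])
```

Both the socket callback and the API response handler would call `add` and only render a message when it returns `True`, so it no longer matters which path delivers a message first.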
A critical challenge is keeping message statuses (pending, delivered, seen, etc.) properly updated on the client. A user can log in, see a message, and then disappear (read: go offline), and the other party may log in later and want to know whether the message was delivered to or seen by the user who disappeared. Doing this in an optimised way requires a lot of thinking. Even adopting XMPP has its own challenges, because by default it provides only a subset of these statuses, and someone needs to know the protocol inside and out to tweak it.
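One piece of this that is easy to get wrong is receipts arriving late or out of order, for example a "delivered" receipt showing up after "seen" once the other user reconnects. A simple rule is to let a status only move forward. The sketch below illustrates that rule under the assumption that the three statuses named above form a strict order; `Status` and `StatusTracker` are hypothetical names.

```python
from enum import IntEnum
from typing import Dict


class Status(IntEnum):
    """Assumed ordering: a message can only progress to a higher status."""
    PENDING = 0
    DELIVERED = 1
    SEEN = 2


class StatusTracker:
    """Hypothetical tracker that ignores stale or repeated receipts."""

    def __init__(self) -> None:
        self._status: Dict[str, Status] = {}

    def register(self, message_id: str) -> None:
        """Start tracking a newly sent message as pending."""
        self._status.setdefault(message_id, Status.PENDING)

    def apply_receipt(self, message_id: str, status: Status) -> bool:
        """Apply a receipt; return False if it would not advance the status."""
        current = self._status.get(message_id, Status.PENDING)
        if status <= current:
            return False
        self._status[message_id] = status
        return True

    def status_of(self, message_id: str) -> Status:
        return self._status.get(message_id, Status.PENDING)
```

With this rule the client can replay every receipt it fetches after reconnecting without worrying about downgrading a message from "seen" back to "delivered".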
We have just gone through a few of the major challenges, and only for a 1-1 chat system at that. Group chat has its own challenges. However, if the 4 points mentioned above are handled well, the challenges posed by a group chat implementation become a bit easier. So tackling these 4 well is a good start for any chat system or service.