Cisco CDR Reporting and Analytics

 

Troubleshooting FAQ

NEW INSTALLS:

I set up the data input but when I log into Splunk the Cisco CDR app says there’s no data indexed.

See if the data is going into the wrong index. Click “Search” and run this search with “All Time” as the time range:

| tstats count WHERE sourcetype="cucm_cdr" AND index=* BY index
| appendcols [| rest "/servicesNS/admin/cisco_cdr/admin/macros" splunk_server=local | search title="custom_index" | rex field="definition" "\s?index\s?=\s?\"?(?<index>[^\"]+)\"?\s?$" | fillnull value="main" index | rename index as macro_index | fields macro_index]
| eval Matches = if('macro_index' == 'index', "They Match", "They do NOT match")
| rename index AS "Index as found in data", macro_index AS "Index as defined in macro"
| table "Index as found in data", "Index as defined in macro", Matches, count

Output possibility 1 – OK!

The correct and desired result is output like this:

Index as found in data    Index as defined in macro    Matches       count
cisco_cdr                 cisco_cdr                    They Match    320

In that case something really odd is going on.  Please go back to the app’s home page and let it load fully.  Check the upper right section after a minute or two and see if you now see data (blue bars) for the most recent day.

If you now see data – great!  You probably just clicked through too quickly after setup and it needed a little time to catch up.  🙂

If you still do not see data, please contact us and we can troubleshoot further!

Output possibility 2

But you may also get something like this, with different values in the first two columns and the Matches column telling you that they don’t match:

Index as found in data    Index as defined in macro    Matches             count
myCiscoCDRData            cisco_cdr                    They do NOT match   320

If it says They do NOT match, this is the problem. Either change the macro to match the index actually being used (recommended), or change the data input so that they match. If you choose to change the data input, restart that Splunk instance afterward.
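If you choose the recommended route of changing the macro, you can edit it in Splunk Web under “Settings > Advanced search > Search macros” (look for “custom_index” in the cisco_cdr app), or directly in a local macros.conf. Below is a minimal sketch, reusing the example index name myCiscoCDRData from the table above – substitute whatever index your CDR data is actually in:

    # $SPLUNK_HOME/etc/apps/cisco_cdr/local/macros.conf
    # (example only – replace myCiscoCDRData with the index your CDR data really uses)
    [custom_index]
    definition = index="myCiscoCDRData"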

If you made any changes or corrections, wait for a few calls to happen and check again from the beginning of this troubleshooting section.

Output possibility 3

But you may also get something like this, with the first column empty and the Matches column telling you that they don’t match:

Index as found in data    Index as defined in macro    Matches             count
                          cisco_cdr                    They do NOT match

In this case, there is no data in any index accessible to your user that matches the Cisco CDR data’s sourcetype.

If this is the case, then proceed below to continue checking the sourcetype.

Output possibility 4

Lastly, you might get neither set of results, in which case the output is simply:

No results found.

If this happens then there’s no data in any index this user can access, AND the default macro we supply is not installed.  The most likely cause of this is that setup was never fully completed – double-check that the apps were installed as per our instructions, then start this troubleshooting doc over again at the top.
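To confirm whether the macro itself is present, you can query it directly – this is the same REST call the search above uses.  If this returns no rows, the app’s “custom_index” macro is missing or not visible to your user:

    | rest "/servicesNS/admin/cisco_cdr/admin/macros" splunk_server=local
    | search title="custom_index"
    | table title definition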

If the search finds no results, continue by checking the sourcetype.

We need to confirm that the data is tagged with the right sourcetype. If it is not, data will come in but the searches we run won’t find it.

  1. Go back to wherever the data input was defined. (This might be a UF or an HF, or, if you set up just a standalone indexer, the indexer itself.)
  2. Look at the data input settings. The sourcetype for the CDR data should be “cucm_cdr”.
    1. If you are looking at the file “inputs.conf”, it should contain a line like
      sourcetype = cucm_cdr
  3. If it is set to any other sourcetype, this is the problem.  Change the sourcetype to “cucm_cdr” and restart that Splunk instance.
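As a cross-check from the Splunk side, the search below (run over “All Time”) lists which sourcetypes and indexes the data actually arrived under.  If your CDR data shows up under some sourcetype other than cucm_cdr, that confirms the input is mis-tagged:

    | tstats count WHERE index=* BY sourcetype, index
    | sort - count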

If you made any changes or corrections, wait for a few calls to happen and check again from the beginning of this troubleshooting section.

If everything looks OK so far and the sourcetype is correct, continue by checking the source files.

  1. Go back to wherever the data input was defined. (This might be a UF or an HF, or, if you set up just a standalone indexer, the indexer itself.)
  2. Look at the data input settings in inputs.conf.  Find the type of input you have set up, and the path of the directory being monitored.  In the below example, it’s a batch/sinkhole input and the path is /var/log/remote/<some path>/:
    [batch:///var/log/remote/<some path>/cdr_*]

There are two options here.  The recommended way, and the style  outlined in the step above, is a batch/sinkhole input.  This means Splunk watches the folder involved and, when a new file gets created by the SFTP software, Splunk reads that file, sends it off to the indexer and then deletes it.
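For reference, a complete batch/sinkhole stanza usually looks something like the sketch below.  The path and index name are placeholders – use your own values – but “move_policy = sinkhole” is what actually tells Splunk to delete each file once it has been read:

    [batch:///var/log/remote/<some path>/cdr_*]
    move_policy = sinkhole
    sourcetype = cucm_cdr
    index = cisco_cdr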

It is possible you have a standard monitor input with the stanza looking like

    [monitor:///var/log/remote/<some path>/cdr_*]

If that’s the case, Splunk won’t delete those files automatically. Instead, it just watches for them to be created or changed and sends that data in.

If you have a batch input:

  1. Change to that directory.
  2. Take a look at a directory listing.
  3. If it is full of files, this is the problem.  There are a variety of reasons these files won’t be picked up and deleted; some of the more common ones are listed below, and the splunkd log search sketched after this list can help narrow down which one applies:
    1. Permissions are not correct and the Splunk user can’t delete the files.
    2. There is a typo in the path in the input stanza.
    3. The filename prefix changed (the files are no longer named “cdr_…” but something different).
    4. Lastly, maybe these aren’t really the files you need and it’s an accident they are there.
  4. If the directory is empty, then check one more thing:
    1. Stop the Splunk instance for a few minutes.
    2. Watch as calls finish; new files should be created once per minute.
    3. Confirm those files get created.
      1. If no files show up even while the Splunk instance is off, then this is your problem.  Either Call Manager is no longer sending files to your Splunk instance, the authentication Call Manager uses to SFTP the files is broken, or the SFTP server has been turned off.
    4. (If new files do get created, go ahead and turn the Splunk instance back on; it will index and delete those files within a few minutes as if nothing had happened.)
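One way to see why Splunk is not consuming the files is to check its own internal logs for warnings or errors that mention the monitored directory.  This is only a rough sketch – substitute the path from your own input stanza, and note that the exact messages and component names vary by Splunk version:

    index=_internal sourcetype=splunkd (log_level=ERROR OR log_level=WARN) "/var/log/remote/"
    | table _time log_level component _raw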

If you have a monitor input:

  1. Change to that directory.
  2. Take a look at a directory listing.
  3. If it is full of files, this is the problem.  Since you are using the unrecommended monitor input, you’ll have to continue troubleshooting on your own, but common mistakes we’ve seen are:
    1. The script (or whatever is deleting the files outside of the Splunk HF) is no longer running.
    2. There never was a cleanup script set up, and it just took this long before the operating system gave up.
    3. Permissions changed and the script that deletes/moves the files can no longer do so.
    4. The filename prefix changed (the files are no longer named “cdr_…” but something different).
  4. If it is empty (or very nearly so), this is the problem.  One of the following has likely happened:
    1. Call Manager is no longer sending files to your Splunk instance,
    2. the authentication Call Manager is using to SFTP the files is broken,
    3. or the SFTP server has been turned off.

 


I set up the data input, but something is wrong. The homepage says “Not all fields were extracted properly”, and/or when I search the data in core Splunk UI the data is there but the fields look wrong.

If you’re using a Splunk forwarder, check that you installed the “TA-cisco_cdr” app on the forwarder and restarted the forwarder afterward. If you did not, this is the problem.
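As a quick spot-check after restarting the forwarder, a search like the sketch below will tell you whether fields are now being extracted for newly indexed events.  It assumes the stock CUCM CDR column names (e.g. callingPartyNumber); if extracted_count comes back as 0 while count does not, the extractions are still not being applied:

    index=* sourcetype=cucm_cdr
    | head 100
    | stats count, count(callingPartyNumber) AS extracted_count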


I set up the SFTP server but no files ever get sent over

Open an SFTP client and connect to the SFTP server using the same credentials that UCM is using.
Check whether you can transfer a file (any old file will do).

  • If you cannot – this is the problem.
    This is fairly common – you may have tested the SFTP server setup just by connecting, but being able to open the connection and see the SFTP root directory doesn’t mean the user account has any write permissions there. Check permissions and give that user account write permissions to that directory.
  • If you CAN transfer a file (remember to delete the test file afterward):
    • Double check network issues from the UCM host to the SFTP server host.
      • Check that the UCM host can actually reach the host on which the SFTP server is installed. Check for any firewall software that may be running.
      • Check the “Billing Application Server” config inside CM Administration. Possibly there was a typo in the IP, or the auth, or a hostname was used that doesn’t work from there.
    • Restart the various Cisco CDR Services that are involved with this – it is rare to need this, but it does happen!
      • Log into the Cisco Unified Serviceability application using an administrator account.
      • Browse to Tools, then to Control Center – Network Services.
      • Select your CUCM server (Publisher) from the drop down asking which server to configure and click Go.
      • Browse that list and find the section CDR Services.  In there, try the following steps one by one, testing after each to see if the problem is resolved.
        • Click beside the Cisco CDR Repository Manager service to select it, then at the bottom of the page click Restart.  TEST to see if this has resolved the problem.
        • Click beside the Cisco CDR Agent service to select it, then at the bottom of the page click Restart.  TEST to see if this has resolved the problem.
        • Click beside the Cisco CAR Scheduler service to select it, then at the bottom of the page click Restart.  TEST to see if this has resolved the problem.

Everything works, but all my searches are incredibly slow

  1. Check for under-provisioned hardware.
    • CPU – You should have at least 8 cores per indexer, and Splunk technically sets 12 as the recommended minimum.
      • NOTE: in the virtual world you need to keep in mind loads, CPU Ready times, NUMA boundaries and a lot of other things.  Oftentimes LESS is more in those cases!
    • RAM – 8 GB should be your absolute minimum; give it 12 or 16 GB if you can.
    • Disk – 1200 IOPS isn’t found on a consumer-grade 5400 RPM laptop drive, but that’s the recommended minimum.  To get that speed:
      • With 15,000 RPM disks, you need 6 or more in RAID 0, 12 or more in RAID 10, and neither R5 nor R6 is recommended at all due to write speed issues.
      • With 10,000 RPM disks, it takes perhaps 8 or more in RAID 0, 16 or more in RAID 10, and neither R5 nor R6 is recommended at all due to write speed issues.
      • With SSDs, generally a single one should be OK as long as it’s not super-cheap and junk, and if it’s on a reasonably fast bus (e.g. “not USB”).
      • AND – with virtual machines, all the above rules apply, but then you have to take into account what access the host itself has to the volume/LUN that it can see.  Also it’s very common for storage in the virtual world to be oversubscribed by a lot.  Sharing those resources isn’t always the best use of them.
  2. Check for Accelerated Reports. Go to “Settings > Searches, reports, and alerts” and check whether anyone has turned on “report acceleration” for any of them. Accelerated reports show a lightning-bolt icon in the little “lightning bolt” column; the search sketched just after this list will also find them for you. If you find any of these, check whether the owner is still using them and, if not, turn off acceleration. We’re not sure why, but lately in Splunk 6.5.* and 6.6.* we’ve seen a couple of cases where one accelerated report is sufficient to destroy search performance in general across the index. If this is the problem, the effect will be evident within minutes of turning off acceleration.
  3. If you are using a “monitor” input (against our recommendations) and this is a standalone indexer, the script that is supposed to delete all the old files after 3 days or so may have stopped working. If so, Splunk can spend virtually all of the host’s resources just checking the 100,000+ files for appended changes.
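Regarding accelerated reports (item 2 above): to find every report that has acceleration turned on without paging through the UI, a REST search like the sketch below will list them. Run it as an admin so you can see other users’ reports; depending on the Splunk version, the enabled value may appear as 1 or true:

    | rest "/servicesNS/-/-/saved/searches" splunk_server=local
    | search auto_summarize=1
    | table title eai:acl.app eai:acl.owner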

It’s working, but for some reason it’s falling behind – data doesn’t show up for many minutes/hours/days

  • If you’re using a “monitor” input (instead of the sinkhole input type that we recommend), then check the directory on the filesystem. If you see more than 10,000 or 50,000 files in there, this is the problem.
    If so, the script that is supposed to delete all the old files after 3 days or so may have stopped working. Splunk is spending most (or virtually all) of the host’s resources just opening each file and checking it for appended changes, as fast as it can. Getting out of this state can be tricky, and it is best to contact Sideview Support.
  • On the other extreme, if things are only behind by 1 or 10 minutes, it may just be the wrinkle that the CDR records don’t exist until the calls terminate. Make sure you’re not expecting to see data about a call that is still in progress.
  • And in the middle, there are a huge number of reasons why Splunk or the host it’s on might be really pressed for resources. Open the Monitoring Console by opening the “Settings” menu and clicking the giant “Monitoring Console” image on the left side of the menu. There’s a LOT to potentially check in here, but click around. If something looks bad it probably is. Things you might find – other apps on the box may be running ungodly numbers of scheduled searches or accelerated data models. Some users might have a bad habit of spinning up expensive realtime searches while they work.
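To put a number on how far behind things actually are, you can compare each event’s timestamp to the time it was indexed. A rough sketch, run over a recent time range; keep in mind that a CDR’s timestamp can precede the end of the call, so some baseline lag roughly equal to call duration is normal:

    index=* sourcetype=cucm_cdr
    | eval lag_seconds = _indextime - _time
    | stats avg(lag_seconds) AS avg_lag_seconds, max(lag_seconds) AS max_lag_seconds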




If you have any comments at all about the documentation, please send them to docs@sideviewapps.com.