XML Schema
XML Schema
Are there any XSD schema files available for the API data? Really trying to get the gamedetail data formatted in my pitch location query app. Unfortunately the only way I can easily get the data to flatten nicely is to open it in Excel and then copy/paste to the database; which is by no means capable of being cleanly automated. When I try to OLEDB query the XML data directly, it separates the nests into 12 different tables that lose their relation.
- FTMSupport
- Site Admin
- Posts: 13193
- Joined: Sat Mar 28, 2009 7:25 pm
Re: XML Schema
No, we do not maintain an XSD schema. You can write an XSL to translate the data to a flat format if you want.
Check out the new iScore Baseball documentation page!
Includes videos and user manual.
http://iscoresports.com/baseball/training.php#docs
Includes videos and user manual.
http://iscoresports.com/baseball/training.php#docs
Re: XML Schema
I've been a programmer for over 15 years and have never felt more like an idiot trying to figure out how to use your API data; namely the "gamesdetail" output. As near as I can tell, there is nothing relational in the data. Each nest appears to output to its own table, but does not include the primary key for the parent. I have given up on multiple occasions and just tried importing the data to my database using excel, but I've discovered that in some of the games there are additional columns which don't mach my existing columns. Am I missing something. Surely it can't be this difficult.
Note: My background is in SQL Server, Oracle, SAP, and Visual Studio (VB.net)
Note: My background is in SQL Server, Oracle, SAP, and Visual Studio (VB.net)
- FTMSupport
- Site Admin
- Posts: 13193
- Joined: Sat Mar 28, 2009 7:25 pm
Re: XML Schema
The XML parent/child hierarchy implies the relationship --- there is no need to repeatedly include a key for the parent node as the relationship implies it. So if you have:
The parent/child relationship of the nodes shows that the pitches belong to the BATTER node they are a child of, and the EVENT nodes belong to the PITCH nodes they are children of. Including a primary key as data just makes the XML more verbose than it needs to be.
If you have a specific question about the XML, you can email us. The gamedetail is intentionally not documented in the API --- it is available, and is being used by several third parties with great success, but IS complicated and something that we debated whether to even make available as we can not afford to spend time answering questions around the complexity.
Code: Select all
<BATTER ...>
<PITCH ...>
<SITUATION ...>
<EVENT ...>
<PITCH ...>
<SITUATION ...>
<EVENT ...>
If you have a specific question about the XML, you can email us. The gamedetail is intentionally not documented in the API --- it is available, and is being used by several third parties with great success, but IS complicated and something that we debated whether to even make available as we can not afford to spend time answering questions around the complexity.
Check out the new iScore Baseball documentation page!
Includes videos and user manual.
http://iscoresports.com/baseball/training.php#docs
Includes videos and user manual.
http://iscoresports.com/baseball/training.php#docs
- CSThunderCoach
- Posts: 263
- Joined: Thu Mar 04, 2010 8:55 am
- Location: Colorado Springs, CO
Re: XML Schema
I have been working with the gamedetail XML data for more than a year. If you need any specific questions answered I might be able to provide you with some information. Like Support has indicated, it is a complex dataset that you cannot simply import into a database - you will need to develop your own XSLs or secondary APIs to extract the data and build the DB relationships.
Re: XML Schema
So given that there doesn't seem to be a relational approach to extract the gamedetail via a standard provider, I've taken a first stab at writing a parser, looking at each node in the document and making some logical decisions for further looping. There is no error checking and the outputs are just console prints, but I thought it might be beneficial for others to see and perhaps expound upon. It is my intent that each "record" in each node level will be saved to a node specific table in my database, with the ID from the parent node. This will fairly easily allow for complex queries to be run against the data.
(VB.net 2010)
(VB.net 2010)
Code: Select all
Private Sub getParseXMLGameDetail()
Dim XMLFile As String = txtXMLFile.Text
Dim xmlDoc As New XmlDocument
xmlDoc.Load(XMLFile)
Dim node As XmlNode = xmlDoc.SelectSingleNode("/GAME")
If node IsNot Nothing Then
Debug.Print(node.Attributes("guid").Value)
Debug.Print(node.Attributes("name").Value)
For Each RootLevel As XmlNode In node.ChildNodes
Select Case RootLevel.Name
Case Is = "TEAM"
Debug.Print(RootLevel.Attributes("guid").Value)
Debug.Print(RootLevel.Attributes("side").Value)
Debug.Print(RootLevel.Attributes("name").Value)
For Each TEAMLevel As XmlNode In RootLevel.ChildNodes
Select Case TEAMLevel.Name
Case Is = "PLAYER"
Debug.Print(TEAMLevel.Attributes("guid").Value)
Debug.Print(TEAMLevel.Attributes("lastname").Value)
Debug.Print(TEAMLevel.Attributes("firstname").Value)
Debug.Print(TEAMLevel.Attributes("bats").Value)
Debug.Print(TEAMLevel.Attributes("throws").Value)
Debug.Print(TEAMLevel.Attributes("jersey").Value)
End Select
Next
Case Is = "INNING"
Debug.Print(RootLevel.Attributes("num").Value)
Debug.Print(RootLevel.Attributes("tb").Value)
For Each INNINGLevel As XmlNode In RootLevel.ChildNodes
Debug.Print(INNINGLevel.Attributes("lineuppos").Value)
Debug.Print(INNINGLevel.Attributes("guid").Value)
For Each BATTERLevel As XmlNode In INNINGLevel.ChildNodes
Select Case BATTERLevel.Name
Case Is = "PITCH"
Debug.Print(BATTERLevel.Attributes("id").Value)
Debug.Print(BATTERLevel.Attributes("time").Value)
Debug.Print(BATTERLevel.Attributes("ms").Value)
Debug.Print(BATTERLevel.Attributes("pitcher").Value)
Debug.Print(BATTERLevel.Attributes("type").Value)
Debug.Print(BATTERLevel.Attributes("speed").Value)
Debug.Print(BATTERLevel.Attributes("batterhand").Value)
Debug.Print(BATTERLevel.Attributes("pitcherhand").Value)
Debug.Print(BATTERLevel.Attributes("x").Value)
Debug.Print(BATTERLevel.Attributes("y").Value)
For Each PITCHLevel As XmlNode In BATTERLevel.ChildNodes
Select Case PITCHLevel.Name
Case Is = "SITUATION"
Debug.Print(PITCHLevel.Attributes("visitor").Value)
Debug.Print(PITCHLevel.Attributes("home").Value)
Debug.Print(PITCHLevel.Attributes("balls").Value)
Debug.Print(PITCHLevel.Attributes("strikes").Value)
Debug.Print(PITCHLevel.Attributes("outs").Value)
Debug.Print(PITCHLevel.Attributes("first").Value)
Debug.Print(PITCHLevel.Attributes("second").Value)
Debug.Print(PITCHLevel.Attributes("third").Value)
Debug.Print(PITCHLevel.Attributes("pitcherballs").Value)
Debug.Print(PITCHLevel.Attributes("pitcherstrikes").Value)
Case Is = "EVENT"
Select Case PITCHLevel.Attributes.Count
Case Is = 2
Debug.Print(PITCHLevel.Attributes("type").Value)
Debug.Print(PITCHLevel.Attributes("desc").Value)
Case Is = 4
Debug.Print(PITCHLevel.Attributes("type").Value)
Debug.Print(PITCHLevel.Attributes("desc").Value)
Debug.Print(PITCHLevel.Attributes("player").Value)
Debug.Print(PITCHLevel.Attributes("base").Value)
Case Is = 6
Debug.Print(PITCHLevel.Attributes("type").Value)
Debug.Print(PITCHLevel.Attributes("desc").Value)
Debug.Print(PITCHLevel.Attributes("x").Value)
Debug.Print(PITCHLevel.Attributes("y").Value)
Debug.Print(PITCHLevel.Attributes("hit_type").Value)
Debug.Print(PITCHLevel.Attributes("hit_strength").Value)
End Select
Case Is = "RESULT"
Debug.Print(PITCHLevel.Attributes("visitor").Value)
Debug.Print(PITCHLevel.Attributes("home").Value)
Debug.Print(PITCHLevel.Attributes("balls").Value)
Debug.Print(PITCHLevel.Attributes("strikes").Value)
Debug.Print(PITCHLevel.Attributes("outs").Value)
Debug.Print(PITCHLevel.Attributes("first").Value)
Debug.Print(PITCHLevel.Attributes("second").Value)
Debug.Print(PITCHLevel.Attributes("third").Value)
Debug.Print(PITCHLevel.Attributes("pitcherballs").Value)
Debug.Print(PITCHLevel.Attributes("pitcherstrikes").Value)
Case Is = "LINEUP"
Debug.Print(PITCHLevel.Attributes("tguid").Value)
For Each LINEUPLevel As XmlNode In PITCHLevel.ChildNodes
Select Case LINEUPLevel.Name
Case Is = "POS"
Debug.Print(LINEUPLevel.Attributes("guid").Value)
Debug.Print(LINEUPLevel.Attributes("bat").Value)
Debug.Print(LINEUPLevel.Attributes("field").Value)
End Select
Next
End Select
Next
End Select
Next
Next
Case Else
End Select
Next
End If
End Sub
- CSThunderCoach
- Posts: 263
- Joined: Thu Mar 04, 2010 8:55 am
- Location: Colorado Springs, CO
Re: XML Schema
You have a good start going here.
A couple of things to keep in mind with the EVENT node:
There can and usually is more than 1 event node.
Event Nodes may have child nodes that indicate the defensive players involved in the play. One FIELDER node for each fielder involved in the play.
If I remember correctly NOTE nodes appear within the PITCH nodes as well.
A couple of things to keep in mind with the EVENT node:
There can and usually is more than 1 event node.
Event Nodes may have child nodes that indicate the defensive players involved in the play. One FIELDER node for each fielder involved in the play.
If I remember correctly NOTE nodes appear within the PITCH nodes as well.
Code: Select all
<EVENT y="98" x="-35" type="LOC" desc="" hit_strength="2" hit_type="2"/>
<EVENT type="LD" desc="Line Drive" fielders="6">
<FIELDERS>
<FIELDER player="" position="6" seq="1"/>
</FIELDERS>
</EVENT>
-
- Posts: 46
- Joined: Mon Aug 30, 2010 4:49 am
Re: XML Schema
Hi,
I have also been working on parsing the gamedetail (using php). I am wondering how you guys have set up the database schema as there are problably many ways to do so. If you would like to share your schema that would be nice!
JB
I have also been working on parsing the gamedetail (using php). I am wondering how you guys have set up the database schema as there are problably many ways to do so. If you would like to share your schema that would be nice!
JB
- CSThunderCoach
- Posts: 263
- Joined: Thu Mar 04, 2010 8:55 am
- Location: Colorado Springs, CO
Re: XML Schema
The DB Schema is going to depend on what you are using the data for. You can create a schema that is highly de-normalized which will give you the greatest flexibility in creating all kinds of customized reports.
Mine is pretty simple in that I am only trying to get to specific additional stats.
Events = all event codes (both from FTM and some of my own that I use for calculations)
Games = all games played with standard information (runs, dates, home & away teams, notes, etc)
Players = all players with a reference to each team that they have played on
Fielding = each position that a player has played and the number of innings at each position (primary key is used to separate the stats for each position)
FStats = fielding stats by player/fielding position
BStats = batting stats by batter/pitcher (used to provide hit charts for both hitters and pitchers [hits against])
PStats = pitching stats by pitcher (standard ER, BF, BB, etc.) Pitcher v Batter stats I get from BStats.
Even with this small schema the DB size can grow quite large in a hurry. If you have 12 players, and players move to different positions each inning, the FStats table gets big quick
Mine is pretty simple in that I am only trying to get to specific additional stats.
Events = all event codes (both from FTM and some of my own that I use for calculations)
Games = all games played with standard information (runs, dates, home & away teams, notes, etc)
Players = all players with a reference to each team that they have played on
Fielding = each position that a player has played and the number of innings at each position (primary key is used to separate the stats for each position)
FStats = fielding stats by player/fielding position
BStats = batting stats by batter/pitcher (used to provide hit charts for both hitters and pitchers [hits against])
PStats = pitching stats by pitcher (standard ER, BF, BB, etc.) Pitcher v Batter stats I get from BStats.
Even with this small schema the DB size can grow quite large in a hurry. If you have 12 players, and players move to different positions each inning, the FStats table gets big quick
Re: XML Schema
On the EVENTS data, here's what I'm seeing. Does my logic seem right? I count the attributes under the EVENT node to determine what type of event occurs:
2 Attributes = Pitcher Events
3 Attributes = Fielder Events
-------> Sub Node FIELDERS
4 Attributes = Runner Events
5 Attributes = Field Events
-------> Sub Node FIELDERS
6 Attributes = Hit Events
2 Attributes = Pitcher Events
3 Attributes = Fielder Events
-------> Sub Node FIELDERS
4 Attributes = Runner Events
5 Attributes = Field Events
-------> Sub Node FIELDERS
6 Attributes = Hit Events