Undatasio tsla gen report02


Last updated 5 months ago


Leveraging UnDatas.io and DeepSeek to Analyze Tesla Gen Report 2: Intelligent Question Answering Unveiled

A recipe 🧑‍💻 📄 🤖

By xll, Tech Enthusiast @ Undatasio

In this notebook, we'll execute the following steps:

👍1. Laying the Foundation: Installing and Configuring

  • Install the Undatasio Python API library

  • Initialize the OpenAI object for deepseek integration

🤖2. Unearthing Hidden Gems: Data Extraction from Tsla-gen-report

  • Leverage the show_version function of the Undatasio object

📄3. Unlocking Insights: Deep Dive into Intelligent Questioning

  • Set up the deepseek-chat model

  • Pose targeted questions

To run this notebook successfully, you'll need an Undatas.io API key and an API key for the deepseek model.

👍1. Setting Up UnDatas.io and Integrating with OpenAI's deepseek

Installing the Undatasio Python API library

!pip install -U -q openai undatasio

To create an UnDatasIO object, you need a token (your API key) and, optionally, a task name from the Undatasio platform.

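import os

from undatasio.undatasio import UnDatasIO

# Assumption: the token is stored in an environment variable named UNDATASIO_API_KEY.
UNDATASIO_API_KEY = os.getenv("UNDATASIO_API_KEY")
undatasio_obj = UnDatasIO(UNDATASIO_API_KEY, task_name='PdfParserDemo')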

🐥2. Extracting Valuable Information from Tsla-gen-report

The show_version function of the Undatasio object lists every version, and the files in each version, for the task name associated with the current token.

version_data = undatasio_obj.show_version()
version_data.data
   title    version  count  file_name
0  1 files  v4       1      [tsla-20241023-gen_test02.pdf]
1  1 files  v3       1      [tsla-20241023-gen_test2.pdf]
2  1 files  v2       1      [tsla-20241023-gen_test.pdf]
3  4 files  v1       4      [PureTable.pdf, PureFormula.pdf, EditablePureT...
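The version labels in the listing above follow a v1, v2, ... pattern, so the newest one can be selected in code rather than hard-coded. The following is a minimal sketch, assuming every label has the form "v" followed by an integer. `latest_version` is a hypothetical helper for illustration and is not part of the undatasio library.

```python
def latest_version(labels):
    """Return the label with the highest numeric suffix, e.g. 'v4'."""
    # Assumption: every label looks like 'v<integer>', as in the listing above.
    return max(labels, key=lambda label: int(label.lstrip("v")))


versions = ["v4", "v3", "v2", "v1"]
print(latest_version(versions))
```

Comparing the numeric suffix rather than the raw string keeps the ordering correct once a task reaches v10 or later.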

You can use the get_result_type method of the Undatasio object to retrieve text information, images, tables, titles, or interline equation information from a PDF file.

This notebook will select the table in the first Financial Summary of tsla-20241023-gen.pdf as the sample content, as shown in the figure above.

result = undatasio_obj.get_result_type(
    type_info=['title', 'table', 'text', 'image', 'interline_equation'],
    # type_info=['table'],
    file_name='tsla-20241023-gen_test02.pdf',
    version='v4'
)
print(result.data)
CORE TECHNOLOGY
ArtificialIntelligenceSoftwareandHardware
InQ3,wereleased the12.5seriesofFSD(Supervised)withimprovedsafetyandcomfort
thankstoincreased data andtraining compute,a5xincreaseinparametercount,andother
architectural choicesthatweplantocontinuescalinginQ4.WereleasedActuallySmart
Summon,whichenablesyourvehicletoautonomouslydrivetoyouinparkinglots,andFD
(Supervised)toCybertruckcustomers,includingend-to-endneuralnetsforhighwaydriving
forthefirst time.Wedeployed andaretraining aheadofschedule ona29kH100clusterat
GigafactoryTexas-whereweexpecttohave50kH100capacitybytheendofOctober
VehicleandOtherSoftware
OurSummer ReleaseincludedYouTubeandAmazonMusicasnative apps.Parentscannow
enableParental ControlsviaPINtoapplymaximumspeedlimit,reduceaccelerationtoChill
force-enablesafetysettings andenablecurfewnotifications.Othernewfeaturesinclude
Hands-FreeFrunk,revampedclmatecontrolsforModel3andModelY,weatherforecast
airqualityandimprovementstoin-vehiclenavigation
Battery,PowertrainandManufacturin
InOctober,weunveiledourCybercabandRobovanvehicles,bothdesignedfromthe
groundupforautonomy-withoutasteeringwheelorpedals.Cybercabwillbebuiltonour
next-generationplatformwhichincludesanewpowertrainwithanestimated5.5mi/kWh
Thiswillbeourmostefficientpowertrainyet.lnQ3,weproducedour100-millionth4680cell
andcontinuedtoprogressourdry-cathodemanufacturinglines
https://backend.undatas.io/static/pdfParser/af9151196a7d4daf8f919653a068b97f/v4/38189d496035423d90b25a223ae5aa06/images//efcc70e68d2fdbef93ef8969dbf99c45e25a97a173f70f1e74d74da4f7a20f2f.jpg


https://backend.undatas.io/static/pdfParser/af9151196a7d4daf8f919653a068b97f/v4/38189d496035423d90b25a223ae5aa06/images//fdeb8230e24ed98a60f987ca59be3ce11670b6c7efc4ff6fc1a886912356bc3f.jpg

TeslaAl trainingcapacityramp throughendof year(H100 equivalentGPUs)
OTHER HIGHLIGHTS
OurEnergy andServices andOtherbusinessesarebecomingincreasinglyprofitablepartsof
Tesla.Asenergy storageproducts continuetoramp andourvehiclefleetcontinuestogrow,
weare expectingcontinuedprofitgrowthfromthesebusinessesovertime
EnergyGenerationandStorage
TheEnergybusinessachievedarecordgrossmarginof30.5%inQ3,asequentialincrease
of596bps,despitelowerMegapackvolumes.Powerwallachievedrecorddeploymentsin
Q3forthesecondquarterinarow.RampofPowerwall3andtheLathropMegafactory
continuedsuccessfully-withLathropdemonstrating200Megapackproduction(40GWh
annualrunrate)inasingleweek.AsofQ3,over100,000PowerwallswereenrolledinVirtual
PowerPlantprograms,deliveringadditionalfinancialvaluetoownerswhileprovidingmuch
neededsupporttothegridduringperiodsofstress.TheShanghaiMegafactoryremainson
tracktobeginshippingMegapacksinQ12025
ServicesandOther
TheServicesandOtherbusinessachievedarecordgrossprofitinQ3,growingover90\%
year-on-year.Sequentialgrowthingrossprofitwasdrivenmostlybyhighergrossprofit
generationfromsupercharging,servicecentermarginimprovementandhighergrossprofit
generationfromPartsSalesandMerchandise.OurSuperchargernetworkcontinuedto
expandinQ3withover2,800newstallsinthequarter,a22\%growthofthenetworkYoY
https://backend.undatas.io/static/pdfParser/af9151196a7d4daf8f919653a068b97f/v4/38189d496035423d90b25a223ae5aa06/images//3f885a63c07562193088dd6ee32efa8c271aeaa51ced0df92724021974fdce1d.jpg


https://backend.undatas.io/static/pdfParser/af9151196a7d4daf8f919653a068b97f/v4/38189d496035423d90b25a223ae5aa06/images//097181937d6be1809384beefae0340505e1360951b99cc4905dcddaefe892b82.jpg
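The parsed output above mixes text with links to extracted images. If only the text should be passed to the model, the links can be separated out first. The following is a rough sketch, assuming the parsed result is a plain string and that image links end in a common image extension. `split_images` and the sample string are hypothetical and are not part of the undatasio library.

```python
import re

# Matches http(s) URLs that end in a common image extension.
IMAGE_URL = re.compile(r"https?://\S+\.(?:jpg|jpeg|png)", re.IGNORECASE)


def split_images(parsed_text):
    """Return (text_without_image_links, list_of_image_urls)."""
    urls = IMAGE_URL.findall(parsed_text)
    text_only = IMAGE_URL.sub("", parsed_text)
    # Drop the blank lines left behind where the links were removed.
    lines = [line for line in text_only.splitlines() if line.strip()]
    return "\n".join(lines), urls


sample = "OTHER HIGHLIGHTS\nhttps://example.com/images//abc.jpg\n\nServicesandOther"
text, urls = split_images(sample)
print(urls)
print(text)
```

Keeping the URLs in a separate list makes it possible to download or display the images later without sending the links to the model.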

🥰3. Deep Dive into Intelligent Questioning

Initialize the OpenAI object. You need to apply for an API key yourself.

import os

from openai import OpenAI

# The key is read from the API_KEY environment variable.
client = OpenAI(
    api_key=os.getenv("API_KEY"),
    base_url="https://api.deepseek.com"
)
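`os.getenv` returns `None` when a variable is unset, which can make a missing key harder to diagnose. One option is to fail early with an explicit message. This is a minimal sketch; `require_env` is a hypothetical helper, not part of the openai or undatasio libraries.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

With this helper, the client above could be created with `api_key=require_env("API_KEY")`.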

Use the deepseek-chat model and set the system and user prompts.

Question 1: What new features were included in the Summer Release of Tesla's vehicle software?

We ask a question over the parsed text and get back the right answer!

query1 = "What new features were included in the Summer Release of Tesla's vehicle software?"
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a data analysis expert. Please extract information from the data provided by the user. Note that only the information asked by the user should be returned, and nothing else should be returned. Data: %s" % (result.data, )},
        {"role": "user", "content": query1},
    ],
    stream=False,
)
res_data = response.choices[0].message.content
res_data
"The new features included in the Summer Release of Tesla's vehicle software are:\n- YouTube and Amazon Music as native apps.\n- Parental Controls via PIN to apply maximum speed limit, reduce acceleration to Chill, force-enable safety settings, and enable curfew notifications.\n- Hands-Free Frunk.\n- Revamped climate controls for Model 3 and Model Y.\n- Weather forecast and air quality improvements in-vehicle navigation."

The result above, rendered:

The new features included in the Summer Release of Tesla's vehicle software are:

  • YouTube and Amazon Music as native apps.

  • Parental Controls via PIN to apply maximum speed limit, reduce acceleration to Chill, force-enable safety settings, and enable curfew notifications.

  • Hands-Free Frunk.

  • Revamped climate controls for Model 3 and Model Y.

  • Weather forecast and air quality improvements in-vehicle navigation.

Question 2: How did the Energy business perform in Q3 in terms of gross margin?

Let's also try asking a question over another piece of the text.

query2 = 'How did the Energy business perform in Q3 in terms of gross margin?'
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a data analysis expert. Please extract information from the data provided by the user. Note that only the information asked by the user should be returned, and nothing else should be returned. Data: %s" % (result.data, )},
        {"role": "user", "content": query2},
    ],
    stream=False,
)
res_data = response.choices[0].message.content
res_data
'The Energy business achieved a record gross margin of 30.5% in Q3, a sequential increase of 596 basis points.'
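Both questions use the same system prompt and differ only in the user query, so the message list can be built in one place. The following is a small sketch of that refactoring. `SYSTEM_TEMPLATE` and `build_messages` are hypothetical names introduced here, and the sample data string is made up for illustration.

```python
# The system prompt used in both questions, with a slot for the parsed data.
SYSTEM_TEMPLATE = (
    "You are a data analysis expert. Please extract information from the data "
    "provided by the user. Note that only the information asked by the user "
    "should be returned, and nothing else should be returned. Data: %s"
)


def build_messages(data, question):
    """Build the system and user messages sent with each question."""
    return [
        {"role": "system", "content": SYSTEM_TEMPLATE % (data,)},
        {"role": "user", "content": question},
    ]


messages = build_messages("Q3 gross margin: 30.5%", "What was the gross margin?")
print(messages[1]["content"])
```

Each call then becomes `client.chat.completions.create(model="deepseek-chat", messages=build_messages(result.data, query1), stream=False)`, with only the query changing between questions.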

This is a code recipe that uses the Undatas.io platform and the deepseek model to extract key insights from the Tesla Gen report.

Pinpoint key data: This notebook uses the table within the first Financial Summary of tsla-20241023-gen.pdf as its sample content. As illustrated in the figures accompanying this recipe, this data is the foundation for the exploration and holds financial and operational details about Tesla's business.

Notes on the two API keys:

  • An Undatas.io API key, obtained from the official Undatas.io platform. This key unlocks the platform's data extraction and manipulation features.

  • An API key for the deepseek model. The code in this notebook sends requests to https://api.deepseek.com through the OpenAI Python client, so the key must be valid for that endpoint. Make sure you have completed the registration and verification steps required to obtain it.
NoteBook is here!