Turbocharging Math Problem-Solving with Undatas.io and a Qwen-max Model
A recipe 🧑💻 📄 🔢
By xll, Tech Enthusiast @ Undatasio
This is a code recipe meticulously crafted to turbocharge the process of solving math problems extracted from PDF test papers by synergistically integrating the powerful Undatas.io platform with a cutting-edge large language model.
In the preceding notebook, we provided a preliminary demonstration problem-solving efficacy using elementary school math test paper questions. This time, the example will employ a junior high school math test paper from the American "Math League" Mind Exploration to further showcase the capabilities and potential of this approach.
In this notebook, we'll take these essential steps:
1. Bootstrapping with Undatas.io Setup
Install the Undatasio Python API library
UncoverUndatas.io's capabilities
Harness the get_result_type method
2. Extracting Data from Math Test paper
Leverage the show_version function of the Undatasio object
Pinpoint key data: the math problems
3. Bridging to Qwen Model via OpenAI
Install the OpenAI Python SDK
Initialize the OpenAI object
4. Solving Math Problems with Qwen-max
Configure the Bailian Qwen-max model
Query the qwen2.5-math-72b-instruct model
To run this notebook without a hitch, you'll need:
An Undatas.io API key sourced directly from the officialUndatas.io platform. This key unlocksUndatas.io's data extraction and manipulation arsenal, vital for our task.
An Alibaba Cloud API key obtained by following Alibaba Cloud's official protocol. This key is your entry pass to the Qwen model via OpenAI's infrastructure.
This file demonstrates using the Undatasio platform and a large language model to answer math questions from a PDF test paper.
You can use the get_result_type method of the Undatasio object to retrieve text information, images, tables, titles, or interline equation information from a PDF file.
In [ ]:
result = undatasio_obj.get_result_type(
type_info=['title', 'table', 'text', 'image', 'interline_equation'],
file_name='Math League First Round Grade 7 (Questions & Answers).pdf',
version='v6'
)
print(result.data)
1. Drawing the diagonals of a rectangle creates exactly\it-triangles
A) 2B)4C)6D)&
2.The least possible average of 2017 different positive integers is
A)1008B)1009C)2017D)2018
3.Thereweresevenfriendswho decidethat theywould alldinetogethereveryeveningif theycouldsit in
a different arrangementeach time.Theywoulduse thesametable,alwayswithseven chairsin thesame
spots.(Two arrangements are considered identical if and only ifeveryone of the seven friends sits on the
samechair.)Howmanydinnerscouldthesevenofthemeatbeforeexhaustingallpossible
arrangements?
A)2520B)5040C)720D)1440
4. Increasing a number by20\%is the same as multiplying it by
A)20\%B)80\%C)120\%D)200\%
5.$100 in nickels is\underbrace{\phantom{R D_{\theta}^{(1)}}2}_{\phantom{(1)}}morecoins than\S100in dimes
A)100B)200C)1000D)2000
6. What is the range of any 2018 consecutive integers?
A)1009B)2017C)2018D) 2019
7.Written asa decimal,\frac{123456789}{100}hasexactly\risingdotseqnon-zero digits to the right of the decimal point.
https://backend.undatas.io/static/pdfParser/af9151196a7d4daf8f919653a068b97f/v6/ed4c7e1d9de94756b4bebffa18ae6f53/images/f642189a8685f75c0704c9d13c2fed6f13a87a2990724bba5672a719d39bc856.jpg
A) 2B)3C)6D) 7
8.Each choir member sang 1 song alone and 2 songs with the entire choir. If 24 songs were
sunginall,thechoirmusthave\_\underline{{?}}members.
A)8B)1IC) 12D) 22
9.A multiple of2017is divided by a multipleof 2018.What is theleast remainderpossible?
https://backend.undatas.io/static/pdfParser/af9151196a7d4daf8f919653a068b97f/v6/ed4c7e1d9de94756b4bebffa18ae6f53/images/7b7f07ad25760633be1db0e6a7a28ea02985977a70ea2f3fb4cafbbbd8927216.jpg
A)0B) 1C)2017D)2018
10.My armful of identicalgumballsweighs4\%less since I dropped onc gumball. How many
gumballsareinmyarmsnow?
A) 23B) 24C)25D)26
11.Thedigitsoftheleast 2-digit integer that is aperfectsquare and aperfectcubehave thesum
A)7B) 8C)9D) 10
12. The year in which my grandfather was born, a perfect square, when subtracted from the year in which
mydaughterwasborn,anotherperfectsquare,givesmygrandfather'sagewhenhedied.Ifmy
grandfatherhadlived,Iwouldhavebeenexactlyhalfhisagein1943.HowoldwasIin1943?
A)42B) 44C)46D)Noneoftheabove
13.Amanhadtwohorses.Hesold oneof them onTuesdayfor\mathbb{S}198andmadeaprofitoftenpercent.On
Wednesday,he sold the other one for $198 and took a loss of tenpercent.Tallyinguphis two deals,did
heshowanetprofitoraloss?
A)EvenB)Anetprofitof\S6
C)A netprofit of $4D)Anetlossof$4
14.Thesumofthelengthsof all theedgesofa cube is144\;\mathrm{cm}Whatistheareaofonefaceofthecube?
A)144~\mathrm{cm}^{2}B) 196 cm²C)256cm²D)324 cm²
15.Thetime815minutesafter8:15P.M.is
A)3:15A.M.B) 9:50 A.M.C)3:15P.M.D) 9:50 P.M.
16.Thenumber180has\underline{{?}}moredivisorsthan thenumber120has
A)0B) 2C)30D)60
17. The8 houses on my street have consecutive integer addresses that add up to 1500.The address with the
greatestnumericalvalueis
A)184B)187C)188D)191
18. Which of these fractions is the sum of an integer and its reciprocal?
A)\frac{7}{3}B)\frac{8}{3}C)\frac93D)\frac{10}{3}
19.Themixednumber2{\frac{1}{4}}is equivalent to many improper fractions that have integer numerators and
A) 24B) 27C)36D)45
20.At my store,\S1of every\S5in sales is profit. If I split10\%of all profits equally among
10 people,each gets\perp\!\%of thetotalsales.
A) 0.2B)2C) 5D)20
https://backend.undatas.io/static/pdfParser/af9151196a7d4daf8f919653a068b97f/v6/ed4c7e1d9de94756b4bebffa18ae6f53/images/7751d69af2760c5d95fdfd07c08a1bd6a51ee0c60b7e4ef5a04495ddd01882d1.jpg
21.If Mary is twice as old as Annwas whenMary was as old as Ann is now, and Mary is 32,how old is
Ann?
A)20B)24C)32D) None of the above
22.Ofthefollowing,whichexpressionhastheleastvalue?
\frac{3^{100}}{4}100\frac34\mathrm{D)}\ \frac{3}{4^{100}}
C
B)
23. I randomly select a positive integer less than 100. The probability that it is the product of exactly 3
differentprimesis
A)\frac{1}{99}\mathrm{~B~}\mathrm{~\frac~{~4~}~{~99~}~}\qquad\qquad\mathrm{~C~}\mathrm{~\frac~{~5~}~{~99~}~}\qquad\qquad\mathrm{~D~}\mathrm{~\frac~{~8~}~{~99~}~}
24.Iftheaverageof3consecutiveticketnumbersisodd,thenthesumoftheleastand
greatestticketnumberscouldbe
A)18B)20C)24D)28
https://backend.undatas.io/static/pdfParser/af9151196a7d4daf8f919653a068b97f/v6/ed4c7e1d9de94756b4bebffa18ae6f53/images/015cf4ac3a2033e4cde49cab670729fae45a694b5b6e4a38b0e5579fa82eced2.jpg
25.Evecountedto4^{60}by consecutive powers of2, starting with2^{1},22,23,....How many powers of2 did
Evecount?
A)30B)120C)240D)3600
26.Howmanyevenintegersbetween1 and 1 000 000have digits that are all primes?
A)1365B) 3906C)5400D)19530
27.If6identicalmachinescanfill80bottlesofsodain12seconds,howmanysecondswouldit
take36ofthesamemachinestofill240bottlesofsoda?
A) 6B)12C)18D)24
28.Ofmy100favoritereleasedsongs,42\%werereleasedaftertheyear2015and76\%were
releasedbefore theyear2017.What percent of myfavorite songswerereleasedin2016?
A)18\%\mathrm{B)~24\%~\qquad~\qquad~C)~34\%~\qquad~\qquad~D)~58\%}
https://backend.undatas.io/static/pdfParser/af9151196a7d4daf8f919653a068b97f/v6/ed4c7e1d9de94756b4bebffa18ae6f53/images/7ef819c441e3c891cf5114e5cb5eeaee21b1807a30ca1959a5c2694133602ecf.jpg
29.(The number of positive even integersless than10°that areperfect squares):(thenumberofpositive
odd integerslessthan10^{6}that areperfectsquares)=
A) 1:1B)2:1C) 499:500D) 999:1000
30.Ofthefollowing,whichis a multiple of4?
2017^{2018}+1\mathrm{~\quad~B~})\,2017^{2018}+3\mathrm{~\quad~C~})\,2017^{2018}+5\mathrm{~\quad~D~})\,2018^{2017}+1
31.If the sum of the measures of two angles ofa parallelogram is 108degrees,the sum of themeasures of
threeofitsanglescouldbe
A)72degreesB)162 degreesC)234 degreesD)252 degrees
32.Mr.Einsteinhatesrepetition.Heeatsatarestaurantnearhishouseonceeveryday.Onthemenu ofthis
restaurant,thereare11appetizers,26entrees,and12kindsofdesserts.Inaddition,thereare12wine
selectionsoffered.Mr.Einsteininsiststhateverydayheeatsadifferentmealcombinationthathasnever
beenservedtohimbefore.Eachmealcombinationconsistsofoneitemfromeachofthefourcategories
ForhowmanyyearscanMr.Einsteineatatthisrestaurant?
A)5B)20C)25D) Over 100
33.Inthecompleteexpansionof(x+1)^{4}whatisthesumofthecoefficientsoftheoddpowersofx?
A) 4B) 6C)8D)10
34.Amanstartswith\mathbb{S}10000andincreases hiswealthby50percent every three years.Howmuchwill he
havein12years?
A)\mathbb{S}30000B)$50625C)\mathbb{S}70000D)Noneoftheabove
35.What is the sum of all positive two-digit integers which are divisible by both the sum and product of
theirdigits?
A)36B)54C)72D) None of the above
36.Ifnis the smallest positive integer such that99nis the cube of an integer, anddis the sum of the digits
ofn,thendis
A)27B) 18C) 12D) 9
37.The area of my rectangle is 480.If my rectangle's length is14 greater than itswidth,then its perimeter is
A)88B)92C)116D)172
38.If{\frac{4}{x}}\!<\!12,which of the followingmust alwaysbe true?
A)x>3\mathrm{~B~)~}\ x>\frac{1}{3}\qquad\qquad\mathrm{~C~)~}\ \frac{1}{x}<3\qquad\qquad\mathrm{~D~)~}\ \frac{1}{x}<\frac{1}{3}
39.Ifx+y=aandx y=b,thenwhat is thevalue ofx^{3}+y^{3}in termsofaandb
\begin{array}{l}{\operatorname{A})\,a^{3}+3a b}\\ {\operatorname{C})\,a^{3}+b^{3}}\end{array}\begin{array}{l}{{\mathrm{B})\;a^{3}-3a b}}\\ {{\mathrm{D})\;a^{3}-b^{3}}}\end{array}
40.IfIsubtractthesquareofoneintegerfromthesquareofanotherinteger,thenthedifferencecouldbe
A)386B)558C)768D)970
<table style="border-collapse: collapse;"><tr style="border: 1px solid black;"><td style="border: 1px solid black; padding: 5px;">1
</td><td style="border: 1px solid black; padding: 5px;"></td><td style="border: 1px solid black; padding: 5px;"></td><td style="border: 1px solid black; padding: 5px;">4
</td><td style="border: 1px solid black; padding: 5px;">S
</td><td style="border: 1px solid black; padding: 5px;">6
</td><td style="border: 1px solid black; padding: 5px;">7
</td><td style="border: 1px solid black; padding: 5px;">8
</td><td style="border: 1px solid black; padding: 5px;">6
</td><td style="border: 1px solid black; padding: 5px;">10
</td></tr><tr style="border: 1px solid black;"><td style="border: 1px solid black; padding: 5px;">D
</td><td style="border: 1px solid black; padding: 5px;">B
</td><td style="border: 1px solid black; padding: 5px;">B
</td><td style="border: 1px solid black; padding: 5px;">C
</td><td style="border: 1px solid black; padding: 5px;">C
</td><td style="border: 1px solid black; padding: 5px;">B
</td><td style="border: 1px solid black; padding: 5px;">A
</td><td style="border: 1px solid black; padding: 5px;">D
</td><td style="border: 1px solid black; padding: 5px;">A
</td><td style="border: 1px solid black; padding: 5px;">B
</td></tr><tr style="border: 1px solid black;"><td style="border: 1px solid black; padding: 5px;">11
</td><td style="border: 1px solid black; padding: 5px;">12
</td><td style="border: 1px solid black; padding: 5px;">13
</td><td style="border: 1px solid black; padding: 5px;">14
</td><td style="border: 1px solid black; padding: 5px;">15
</td><td style="border: 1px solid black; padding: 5px;">16
</td><td style="border: 1px solid black; padding: 5px;">17
</td><td style="border: 1px solid black; padding: 5px;">18
</td><td style="border: 1px solid black; padding: 5px;">19
</td><td style="border: 1px solid black; padding: 5px;">20
</td></tr><tr style="border: 1px solid black;"><td style="border: 1px solid black; padding: 5px;">D
</td><td style="border: 1px solid black; padding: 5px;">D
</td><td style="border: 1px solid black; padding: 5px;">D
</td><td style="border: 1px solid black; padding: 5px;">A
</td><td style="border: 1px solid black; padding: 5px;">B
</td><td style="border: 1px solid black; padding: 5px;">B
</td><td style="border: 1px solid black; padding: 5px;">D
</td><td style="border: 1px solid black; padding: 5px;">D
</td><td style="border: 1px solid black; padding: 5px;">A
</td><td style="border: 1px solid black; padding: 5px;">A
</td></tr><tr style="border: 1px solid black;"><td style="border: 1px solid black; padding: 5px;">21
</td><td style="border: 1px solid black; padding: 5px;">22
</td><td style="border: 1px solid black; padding: 5px;">23
</td><td style="border: 1px solid black; padding: 5px;">24
</td><td style="border: 1px solid black; padding: 5px;">25
</td><td style="border: 1px solid black; padding: 5px;">26
</td><td style="border: 1px solid black; padding: 5px;">27
</td><td style="border: 1px solid black; padding: 5px;">28
</td><td style="border: 1px solid black; padding: 5px;">29
</td><td style="border: 1px solid black; padding: 5px;">30
</td></tr><tr style="border: 1px solid black;"><td style="border: 1px solid black; padding: 5px;">B
</td><td style="border: 1px solid black; padding: 5px;">D
</td><td style="border: 1px solid black; padding: 5px;">C
</td><td style="border: 1px solid black; padding: 5px;">A
</td><td style="border: 1px solid black; padding: 5px;">B
</td><td style="border: 1px solid black; padding: 5px;">A
</td><td style="border: 1px solid black; padding: 5px;">A
</td><td style="border: 1px solid black; padding: 5px;">A
</td><td style="border: 1px solid black; padding: 5px;">C
</td><td style="border: 1px solid black; padding: 5px;">B
</td></tr><tr style="border: 1px solid black;"><td style="border: 1px solid black; padding: 5px;">31
</td><td style="border: 1px solid black; padding: 5px;">32
</td><td style="border: 1px solid black; padding: 5px;">33
</td><td style="border: 1px solid black; padding: 5px;"></td><td style="border: 1px solid black; padding: 5px;">35
</td><td style="border: 1px solid black; padding: 5px;">36
</td><td style="border: 1px solid black; padding: 5px;">37
</td><td style="border: 1px solid black; padding: 5px;">38
</td><td style="border: 1px solid black; padding: 5px;">39
</td><td style="border: 1px solid black; padding: 5px;">40
</td></tr><tr style="border: 1px solid black;"><td style="border: 1px solid black; padding: 5px;">C
</td><td style="border: 1px solid black; padding: 5px;">D
</td><td style="border: 1px solid black; padding: 5px;">C
</td><td style="border: 1px solid black; padding: 5px;">B
</td><td style="border: 1px solid black; padding: 5px;">C
</td><td style="border: 1px solid black; padding: 5px;">C
</td><td style="border: 1px solid black; padding: 5px;">B
</td><td style="border: 1px solid black; padding: 5px;">C
</td><td style="border: 1px solid black; padding: 5px;">B
</td><td style="border: 1px solid black; padding: 5px;">C
</td></tr></table>
2. Bridging to Qwen Model via OpenAI
Install the OpenAI Python SDK, which will be used later to call the Qwen model.
Initialize the OpenAI object. You need to apply for an Alibaba Cloud API key yourself.
In [ ]:
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"
)
In [ ]:
from openai import OpenAI
client = OpenAI(
api_key='sk-43445ad85a9f466eaf1a33a0cafa2a89',
base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"
)
Use the Bailian Qwen-max model and set the system and user prompts.
In [ ]:
completion = client.chat.completions.create(
model="qwen-max",
messages=[
{'role': 'system', 'content': 'You are a text analyst. I will give you a test paper, and you need to return the text information of the specific question according to the question selected by the user. Please note that only the text information of the question asked by the user should be returned, and nothing else should be returned. Test paper: %s' % (result.data, )},
{'role': 'user', 'content': 'Please help me find the question 6.'}],
)
max_result = completion.model_dump_json()
Use the json module to serialize the object returned by the Qwen-max model and extract the math problem mentioned in the user prompt.
'What is the range of any 2018 consecutive integers?\nA)1009\nB)2017\nC)2018\nD)2019'
3. Solving Math Problems with Qwen-max
Use the OpenAI object to query the qwen2.5-math-72b-instruct model with the extracted math problem and serialize the response using json to get the final answer.
In [ ]:
completion = client.chat.completions.create(
model="qwen2.5-math-72b-instruct",
messages=[
{'role': 'system', 'content': 'You are a math teacher. Please conduct step-by-step reasoning and use {} to represent the final answer. Return the text in Markdown format.'},
{'role': 'user', 'content': 'Please solve the question:%s' % (title_text, )}],
)
math_result = completion.model_dump_json()
result = json.loads(math_result)['choices'][0]['message']['content']
print(result)
To determine the range of any 2018 consecutive integers, let's start by defining what we mean by "consecutive integers." Consecutive integers are integers that follow each other in order without any gaps. For example, the integers 1, 2, 3, and 4 are consecutive integers.
Let's denote the smallest of the 2018 consecutive integers by \( x \). Then the next 2017 integers will be \( x+1, x+2, x+3, \ldots, x+2017 \). Therefore, the 2018 consecutive integers can be written as:
\[ x, x+1, x+2, x+3, \ldots, x+2017 \]
The range of a set of numbers is the difference between the largest number and the smallest number in the set. In this case, the largest number is \( x+2017 \) and the smallest number is \( x \). So, the range is:
\[ (x+2017) - x = 2017 \]
Therefore, the range of any 2018 consecutive integers is \(\boxed{2017}\).
The correct choice is \(\boxed{B}\).
Let's also try asking a question over another piece of the MATHEMATICS CONTEST
In [ ]:
completion = client.chat.completions.create(
model="qwen-max",
messages=[
{'role': 'system', 'content': 'You are a text analyst. I will give you a test paper, and you need to return the text information of the specific question according to the question selected by the user. Please note that only the text information of the question asked by the user should be returned, and nothing else should be returned. Test paper: %s' % (result.data, )},
{'role': 'user', 'content': 'Please help me find the question 7.'}],
)
max_result = completion.model_dump_json()
title_text = json.loads(max_result)['choices'][0]['message']['content']
title_text
Out[ ]:
'Written as a decimal, \\(\\frac{123456789}{100}\\) has exactly \\(\\underline{{?}}\\) non-zero digits to the right of the decimal point.\n\nA) 2 \nB) 3 \nC) 6 \nD) 7'
Use the OpenAI object to query the qwen2.5-math-72b-instruct model with the extracted math problem and serialize the response using json to get the final answer.
In [ ]:
completion = client.chat.completions.create(
model="qwen2.5-math-72b-instruct",
messages=[
{'role': 'system', 'content': 'You are a math teacher. Please conduct step-by-step reasoning and use {} to represent the final answer. Return the text in Markdown format.'},
{'role': 'user', 'content': 'Please solve the question:%s' % (title_text, )}],
)
math_result = completion.model_dump_json()
result = json.loads(math_result)['choices'][0]['message']['content']
print(result)
To determine how many non-zero digits are to the right of the decimal point in the decimal representation of \(\frac{123456789}{100}\), we can follow these steps:
1. **Perform the division**: Divide 123456789 by 100.
\[
\frac{123456789}{100} = 1234567.89
\]
2. **Identify the digits to the right of the decimal point**: In the decimal representation 1234567.89, the digits to the right of the decimal point are 8 and 9.
3. **Count the non-zero digits**: The digits 8 and 9 are both non-zero. Therefore, there are 2 non-zero digits to the right of the decimal point.
Thus, the correct answer is \(\boxed{A}\).